Tag Archives: mathematics

Einstein-Years

Scott Aaronson recently published an interesting exchange on his blog Shtetl Optimized, between him and cognitive psychologist Steven Pinker. The conversation was about AI: Aaronson is optimistic (though not insanely so), Pinker pessimistic (again, not insanely so). While fun reading, the whole thing would normally be a bit too off-topic for this blog, except that Aaronson’s argument ended up invoking something I do know a bit about: how we make progress in theoretical physics.

Aaronson was trying to respond to an argument of Pinker’s, that super-intelligence is too vague and broad to be something we could expect an AI to have. Aaronson asks us to imagine an AI that is nothing more or less than a simulation of Einstein’s brain. Such a thing isn’t possible today, and might not even be efficient, but it has the advantage of being something concrete we can all imagine. Aaronson then suggests imagining that AI sped up a thousandfold, so that in one year it covers a thousand years of Einstein’s thought. Such an AI couldn’t solve every problem, of course. But in theoretical physics, surely such an AI could safely be described as super-intelligent: an amazing power that would change the shape of physics as we know it.

I’m not as sure of this as Aaronson is. We don’t have a machine that generates a thousand Einstein-years to test, but we do have one piece of evidence: the 76 Einstein-years the man actually lived.

Einstein is rightly famous as a genius in theoretical physics. His annus mirabilis resulted in five papers that revolutionized the field, and the next decade saw his theory of general relativity transform our understanding of space and time. Later, he explored what general relativity was capable of and framed challenges that deepened our understanding of quantum mechanics.

After that, though…not so much. For Einstein-decades, he tried to work towards a new unified theory of physics, and as far as I’m aware made no useful progress at all. I’ve never seen someone cite work from that period of Einstein’s life.

Aaronson mentions simulating Einstein “at his peak”, and it would be tempting to assume that the unified theory came “after his peak”, when age had weakened his mind. But while that kind of thing can sometimes be an issue for older scientists, I think it’s overstated. I don’t think careers peak early because of “youthful brains”, and with the exception of genuine dementia I don’t think older physicists are that much worse off cognitively than younger ones. The reason so many prominent older physicists go down unproductive rabbit-holes isn’t because they’re old. It’s because genius isn’t universal.

Einstein made the progress he did because he was the right person to make that progress. He had the right background, the right temperament, and the right interests to pick up others’ mathematics and take it seriously as physics. As he aged, he built on what he found, and that background in turn enabled him to do more great things. But eventually, the path he walked down simply wasn’t useful anymore. His story ended with him driven to a theory that simply wasn’t going to work, because given his experience up to that point, that was the work that interested him most.

I think genius in physics is in general like that. It can feel very broad because a good genius picks up new tricks along the way, and grows their capabilities. But throughout, you can see the links: the tools mastered at one age that turn out to be just right for a new pattern. For the greatest geniuses in my field, you can see the “signatures” in their work, hints at why they were just the right genius for one problem or another. Give one a thousand years, and I suspect the well would eventually run dry: the state of knowledge would no longer be suitable for even their breadth.

…of course, none of that really matters for Aaronson’s point.

A century of Einstein-years wouldn’t have found the Standard Model or String Theory, but a century of physicist-years absolutely did. If instead of a simulation of Einstein, your AI was a simulation of a population of scientists, generating new geniuses as the years go by, then the argument works again. Sure, such an AI would be much more expensive, much more difficult to build, but the first one might have been as well. The point of the argument is simply to show such a thing is possible.

The core of Aaronson’s point rests on two key traits of technology. Technology is replicable: once we know how to build something, we can build more of it. Technology is scalable: if we know how to build something, we can try to build a bigger one with more resources. Evolution can tap into both of these, but not reliably: just because it’s possible to build a mind a thousand times better at some task doesn’t mean evolution will ever get around to it.

That is why the possibility of AI leads to the possibility of super-intelligence. If we can make a computer that can do something, we can make it do that something faster. That something doesn’t have to be “general”: you can have programs that excel at one task or another. For each such task, with more resources you can scale things up: so anything a machine can do now, a later machine can probably do better. Your starting point doesn’t necessarily even have to be efficient, or a good algorithm: bad algorithms will take longer to scale, but could eventually get there too.

The only question at that point is “how fast?” I don’t have the impression that’s settled. The achievements that got Pinker and Aaronson talking, GPT-3 and DALL-E and so forth, impressed people by their speed, by how soon they got to capabilities we didn’t expect them to have. That doesn’t mean that something we might really call super-intelligence is close: that has to do with the details, with what your target is and how fast you can actually scale. And it certainly doesn’t mean that another approach might not be faster! (As a total outsider, I can’t help but wonder if current ML is in some sense trying to fit a cubic with straight lines.)

It does mean, though, that super-intelligence isn’t inconceivable, or incoherent. It’s just the recognition that technology is a master of brute force, and brute force eventually triumphs. If you want to think about what happens in that “eventually”, that’s a very important thing to keep in mind.

The Most Anthropic of All Possible Worlds

Today, we’d call Leibniz a mathematician, a physicist, and a philosopher. As a mathematician, Leibniz turned calculus into something his contemporaries could actually use. As a physicist, he championed a doomed theory of gravity. In philosophy, he seems to be most remembered for extremely cheaty arguments.

Free will and determinism? Can’t it just be a coincidence?

I don’t blame him for this. Faced with a tricky philosophical problem, it’s enormously tempting to just blaze through with an answer that makes every subtlety irrelevant. It’s a temptation I’ve succumbed to time and time again. Faced with a genie, I would always wish for more wishes. On my high school debate team, I once forced everyone at a tournament to switch sides with some sneaky definitions. It’s all good fun, but people usually end up pretty annoyed with you afterwards.

People were annoyed with Leibniz too, especially with his solution to the problem of evil. If you believe in a benevolent, all-powerful god, as Leibniz did, why is the world full of suffering and misery? Leibniz’s answer was that even an all-powerful god is constrained by logic, so if the world contains evil, it must be logically impossible to make the world any better: indeed, we live in the best of all possible worlds. Voltaire famously made fun of this argument in Candide, dragging a Leibniz-esque Professor Pangloss through some of the most creative miseries the eighteenth century had to offer. It’s possibly the most famous satire of a philosopher, easily beating out Aristophanes’ The Clouds (which is also great).

Physicists can also get accused of cheaty arguments, and probably the most mocked is the idea of a multiverse. While it hasn’t had its own Candide, the multiverse has been criticized by everyone from bloggers to Nobel prizewinners. Where Leibniz wanted to explain the existence of evil, physicists want to explain “unnaturalness”: the fact that the kinds of theories we use to explain the world can’t seem to explain the mass of the Higgs boson. To explain it, these physicists suggest that there are really many different universes, separated widely in space or built into the interpretation of quantum mechanics. Each universe has a different Higgs mass, and ours just happens to be one we can live in. This kind of argument is called “anthropic” reasoning. Rather than the best of all possible worlds, it says we live in the world best-suited to life like ours.

I called Leibniz’s argument “cheaty”, and you might presume I think the same of the multiverse. But “cheaty” doesn’t mean “wrong”. It all depends what you’re trying to do.

Leibniz’s argument and the multiverse both work by dodging a problem. For Leibniz, the problem of evil becomes pointless: any evil might be necessary to secure a greater good. With a multiverse, naturalness becomes pointless: with many different laws of physics in different places, the existence of one like ours needs no explanation.

In both cases, though, the dodge isn’t perfect. To really explain any given evil, Leibniz would have to show why it is secretly necessary in the face of a greater good (and Pangloss spends Candide trying to do exactly that). To explain any given law of physics, the multiverse needs to use anthropic reasoning: it needs to show that the law must be the way it is to support human-like life.

This sounds like a strict requirement, but in both cases it’s not as strict as it sounds. Leibniz could (and Pangloss does) come up with an explanation for pretty much anything. The problem is that no one actually knows which aspects of the universe are essential and which aren’t. Without a reliable way to describe the best of all possible worlds, we can’t actually test whether our world is one.

The same problem holds for anthropic reasoning. We don’t actually know what conditions are required to give rise to people like us. “People like us” is very vague, and dramatically different universes might still contain something that can perceive and observe. While it might seem like there are clear requirements, so far there haven’t been enough of them for people to do very much with this type of reasoning.

However, for both Leibniz and most of the physicists who believe anthropic arguments, none of this really matters. That’s because the “best of all possible worlds” and “most anthropic of all possible worlds” aren’t really meant to be predictive theories. They’re meant to say that, once you are convinced of certain things, certain problems don’t matter anymore.

Leibniz, in particular, wasn’t trying to argue for the existence of his god. He began the argument convinced that a particular sort of god existed: one that was all-powerful and benevolent, and set in motion a deterministic universe bound by logic. His argument is meant to show that, if you believe in such a god, then the problem of evil can be ignored: no matter how bad the universe seems, it may still be the best possible world.

Similarly, the physicists convinced of the multiverse aren’t really getting there through naturalness. Rather, they’ve become convinced of a few key claims: that the universe is rapidly expanding, leading to a proliferating multiverse, and that the laws of physics in such a multiverse can vary from place to place, due to the huge landscape of possible laws of physics in string theory. If you already believe those things, then the naturalness problem can be ignored: we live in some randomly chosen part of the landscape hospitable to life, which can be anywhere it needs to be.

So despite their cheaty feel, both arguments are fine…provided you agree with their assumptions. Personally, I don’t agree with Leibniz. For the multiverse, I’m less sure. I’m not confident the universe expands fast enough to create a multiverse; I’m not even confident it’s speeding up its expansion now. I know there’s a lot of controversy about the math behind the string theory landscape, about whether the vast set of possible laws of physics are as consistent as they’re supposed to be…and of course, as anyone must admit, we don’t know whether string theory itself is true! I don’t think it’s impossible that the right argument comes around and convinces me of one or both claims, though. These kinds of arguments, “if assumptions, then conclusion”, are the kind of thing that seems useless for a while…until someone convinces you of the conclusion, and then they matter once again.

So in the end, despite the similarity, I’m not sure the multiverse deserves its own Candide. I’m not even sure Leibniz deserved Candide. But hopefully by understanding one, you can understand the other just a bit better.

At New Ideas in Cosmology

The Niels Bohr Institute is hosting a conference this week on New Ideas in Cosmology. I’m no cosmologist, but it’s a pretty cool field, so as a local I’ve been sitting in on some of the talks. So far they’ve had a selection of really interesting speakers with quite a variety of interests, including a talk by Roger Penrose with his trademark hand-stippled drawings.


One thing that has impressed me has been the “interdisciplinary” feel of the conference. By all rights this should be one “discipline”, cosmology. But in practice, each speaker came at the subject from a different direction. They all had a shared core of knowledge, common models of the universe they all compare to. But the knowledge they brought to the subject varied: some had deep knowledge of the mathematics of gravity, others worked with string theory, or particle physics, or numerical simulations. Each talk, aware of the varied audience, was a bit “colloquium-style”, introducing a framework before diving into the latest research. Each speaker knew enough to talk to the others, but not so much that they couldn’t learn from them. It’s been unexpectedly refreshing, a real interdisciplinary conference done right.

At Mikefest

I’m at a conference this week of a very particular type: a birthday conference. When folks in my field turn 60, their students and friends organize a special conference for them, celebrating their research legacy. With COVID restrictions just loosening, my advisor Michael Douglas is getting a last-minute conference. And as one of the last couple of students he graduated at Stony Brook, I naturally showed up.

The conference, Mikefest, is at the Institut des Hautes Études Scientifiques, just outside of Paris. Mike was a big supporter of the IHES, putting in a lot of fundraising work for them. Another big supporter, James Simons, was Mike’s employer for a little while after his time at Stony Brook. The conference center we’re meeting in is named for him.


I wasn’t involved in organizing the conference, so it was interesting seeing differences between this and other birthday conferences. Other conferences focus on the birthday prof’s “family tree”: their advisor, their students, and some of their postdocs. We’ve had several talks from Mike’s postdocs, and one from his advisor, but only one from a student. Including that student and me, three of Mike’s students are here; another two have had their work mentioned but aren’t speaking or attending.

Most of the speakers have collaborated with Mike, but only for a few papers each. All of them emphasized a broader debt though, for discussions and inspiration outside of direct collaboration. The message, again and again, is that Mike’s work has been broad enough to touch a wide range of people. He’s worked on branes and the landscape of different string theory universes, pure mathematics and computation, neuroscience and recently even machine learning. The talks generally begin with a few anecdotes about Mike, before pivoting into research talks on the speakers’ recent work. The recent-ness of the work is perhaps another difference from some birthday conferences: as one speaker said, this wasn’t just a celebration of Mike’s past, but a “welcome back” after his return from the finance world.

One thing I don’t know is how much this conference might have been limited by coming together on short notice. For other birthday conferences impacted by COVID (and I’m thinking of one in particular), it might be nice to have enough time to get most of the birthday prof’s friends and “academic family” there in person. As it is, though, Mike seems to be having fun regardless.

Happy Birthday Mike!

Things Which Are Fluids

For overambitious apes like us, adding integers is the easiest thing in the world. Take one berry, add another, and you have two. Each remains separate, you can lay them in a row and count them one by one, each distinct thing adding up to a group of distinct things.

Other things in math are less like berries. Add two real numbers, like pi and the square root of two, and you get another real number, bigger than the first two, something you can write as an infinite messy decimal. You know in principle you can separate it out again (subtract pi, get the square root of two), but you can’t just stare at it and see the parts. This is less like adding berries, and more like adding fluids. Pour some water into some other water, and you certainly have more water. You don’t have “two waters”, though, and you can’t tell which part started as which.


Some things in math look like berries, but are really like fluids. Take a polynomial, say 5x^2 + 6x + 8. It looks like three types of things, like three berries: five x^2, six x, and eight 1. Add another polynomial, and the illusion continues: add x^2 + 3x + 2 and you get 6x^2 + 9x + 10. You’ve just added more x^2, more x, more 1, like adding more strawberries, blueberries, and raspberries.

But those berries were a choice you made, and not the only one. You can rewrite that first polynomial, for example, as 5(x^2 + 2x + 1) - 4(x + 1) + 7. That’s the same thing, you can check. But now it looks like five of x^2 + 2x + 1, negative four of x + 1, and seven of 1. It’s different numbers of different things, blackberries or gooseberries or something. And you can do this in many ways, infinitely many in fact. The polynomial isn’t really a collection of berries, for all it looked like one. It’s much more like a fluid, a big sloshing mess you can pour into buckets of different sizes. (Technically, it’s a vector space. Your berries were a basis.)
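If you want to see the bucket-pouring spelled out, here’s a small check in Python with sympy (the choice of new “buckets” is just the one from the paragraph above):

```python
import sympy as sp

x = sp.symbols('x')

# The polynomial in the "berry" basis {x**2, x, 1}
p = 5*x**2 + 6*x + 8

# The same polynomial poured into the basis {x**2 + 2*x + 1, x + 1, 1}
q = 5*(x**2 + 2*x + 1) - 4*(x + 1) + 7

print(sp.expand(q))        # 5*x**2 + 6*x + 8
print(sp.simplify(p - q))  # 0: same fluid, different buckets
```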

Even smart, advanced students can get tripped up on this. You can be used to treating polynomials as a fluid, and forget that directions in space are a fluid, one you can rotate as you please. If you’re used to directions in space, you’ll get tripped up by something else. You’ll find that types of particles can be more fluid than berry, the question of which quark is which not as simple as how many strawberries and blueberries you have. The laws of physics themselves are much more like a fluid, which should make sense if you take a moment, because they are made of equations, and equations are like a fluid.

So my fellow overambitious apes, do be careful. Not many things are like berries in the end. A whole lot are like fluids.

W is for Why???

Have you heard the news about the W boson?

The W boson is a fundamental particle, part of the Standard Model of particle physics. It is what we call a “force-carrying boson”, a particle related to the weak nuclear force in the same way photons are related to electromagnetism. Unlike photons, W bosons are “heavy”: they have a mass. We can’t usually predict masses of particles, but the W boson is a bit different, because its mass comes from the Higgs boson in a special way, one that ties it to the masses of other particles like the Z boson. The upshot is that if you know the mass of a few other particles, you can predict the mass of the W.
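To give a rough sense of that “special way” (hedging here: this is only the leading-order relation, and the real prediction piles loop corrections on top of it), the Standard Model ties the W mass to the Z mass through the weak mixing angle θ_W:

$$ m_W = m_Z \cos\theta_W $$

Measure the Z mass and the mixing angle in other processes, and the W mass at this order comes out automatically; the loop corrections shift it slightly, which is exactly why a precise measurement is such a sharp test.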

And according to a recent publication, that prediction is wrong. A team analyzed results from an old experiment called the Tevatron, the biggest predecessor of today’s Large Hadron Collider. They treated the data with groundbreaking care, mindbogglingly even taking into account the shape of the machine’s wires. And after all that analysis, they found that the W bosons detected by the Tevatron had a different mass than the mass predicted by the Standard Model.

How different? Here’s where precision comes in. In physics, we decide whether to trust a measurement with a statistical tool. We calculate how likely the measurement would be, if it were an accident. In this case: how likely would it be that, if the Standard Model were correct, the measurement would still come out this way? To discover a new particle, we require this chance to be about one in 3.5 million, or in our jargon, five sigma. That was the requirement for discovering the Higgs boson. This super-precise measurement of the W boson doesn’t have five sigma…it has seven sigma. That means, if we trust the analysis team, a measurement like this could come out of the Standard Model by accident only about one in a trillion times.
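If you want to see where numbers like these come from, the translation between “sigmas” and probabilities is just the tail of a Gaussian distribution. Here’s a quick sketch (my own illustration, using scipy; particle physicists conventionally quote the one-sided tail):

```python
from scipy.stats import norm

# One-sided Gaussian tail probability: the chance a measurement
# lands at least n sigma away from the prediction by accident.
for n_sigma in (5, 7):
    p = norm.sf(n_sigma)  # survival function, 1 - CDF
    print(f"{n_sigma} sigma: p = {p:.1e}, about 1 in {1/p:,.0f}")

# 5 sigma: p ~ 2.9e-07, about 1 in 3.5 million
# 7 sigma: p ~ 1.3e-12, about 1 in 800 billion
```

(So “one in a trillion” above is the round-number version of that last line.)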

Ok, should we trust the analysis team?

If you want to know that, I’m the wrong physicist to ask. The right physicists are experimental particle physicists. They do analyses like that one, and they know what can go wrong. Everyone I’ve heard from in that field emphasized that this was a very careful group, who did a lot of things impressively right…but there is still room for mistakes. One pointed out that the new measurement isn’t just inconsistent with the Standard Model, but with many previous measurements too. Those measurements are less precise, but still precise enough that we should be a bit skeptical. Another went into more detail about specific clues as to what might have gone wrong.

If you can’t find a particle experimentalist, the next best choice is a particle phenomenologist. These are the people who try to make predictions for new experiments, who use theoretical physics to propose new models that future experiments can test. Here’s one giving a first impression, and discussing some ways to edit the Standard Model to agree with the new measurement. Here’s another discussing what to me is an even more interesting question: if we take these measurements seriously, both the new one and the old ones, then what do we believe?

I’m not an experimentalist or a phenomenologist. I’m an “amplitudeologist”. I work not on the data, or the predictions, but the calculational tools used to make those predictions, called “scattering amplitudes”. And that gives me a different view on the situation.

See, in my field, precision is one of our biggest selling points. If you want theoretical predictions to match precise experiments, you need our tricks to compute them. We believe (and argue to grant agencies) that this precision will be important: if a precise experiment and a precise prediction disagree, it could be the first clue to something truly new. New solid evidence of something beyond the Standard Model would revitalize all of particle physics, giving us a concrete goal and killing fruitless speculation.

This result shakes my faith in that a little. Probably, the analysis team got something wrong. Possibly, all previous analyses got something wrong. Either way, a lot of very careful smart people tried to estimate their precision, got very confident…and got it wrong.

(There’s one more alternative: maybe million-to-one chances really do crop up nine times out of ten.)

If some future analysis digs down deep in precision, and finds another deviation from the Standard Model, should we trust it? What if it’s measuring something new, and we don’t have the prior experiments to compare to?

(This would happen if we build a new even higher-energy collider. There are things the collider could measure, like the chance one Higgs boson splits into two, that we could not measure with any earlier machine. If we measured that, we couldn’t compare it to the Tevatron or the LHC, we’d have only the new collider to go on.)

Statistics are supposed to tell us whether to trust a result. Here, they’re not doing their job. And that creates the scary possibility that some anomaly shows up, some real deviation deep in the sigmas that hints at a whole new path for the field…and we just end up bickering about who screwed it up. Or the equally scary possibility that we find a seven-sigma signal of some amazing new physics, build decades of new theories on it…and it isn’t actually real.

We don’t just trust statistics. We also trust the things normal people trust. Do other teams find the same result? (I hope that they’re trying to get to this same precision here, and see what went wrong!) Does the result match other experiments? Does it make predictions, which then get tested in future experiments?

All of those are heuristics of course. Nothing can guarantee that we measure the truth. Each trick just corrects for some of our biases, some of the ways we make mistakes. We have to hope that’s good enough, that if there’s something to see we’ll see it, and if there’s nothing to see we won’t. Precision, my field’s raison d’être, can’t be enough to convince us by itself. But it can help.

The Undefinable

If I can teach one lesson to all of you, it’s this: be precise. In physics, we try to state what we mean as precisely as we can. If we can’t state something precisely, that’s a clue: maybe what we’re trying to state doesn’t actually make sense.

Someone recently reached out to me with a question about black holes. He was confused about how they were described, about what would happen when you fall into one versus what we could see from outside. Part of his confusion boiled down to a question: “is the center really an infinitely small point?”

I remembered a commenter a while back who had something interesting to say about this. Trying to remind myself of the details, I dug up this question on Physics Stack Exchange. user4552 has a detailed, well-referenced answer, with subtleties of General Relativity that go significantly beyond what I learned in grad school.

According to user4552, the reason this question is confusing is that the usual setup of general relativity cannot answer it. In general relativity, singularities like the singularity in the middle of a black hole aren’t treated as points, or collections of points: they’re not part of space-time at all. So you can’t count their dimensions, you can’t see whether they’re “really” infinitely small points, or surfaces, or lines…

This might surprise people (like me) who have experience with simpler equations for these things, like the Schwarzschild metric. The Schwarzschild metric describes space-time around a black hole, and in the usual coordinates it sure looks like the singularity is at a single point where r=0, just like the point where r=0 is a single point in polar coordinates in flat space. The thing is, though, that’s just one choice of coordinates. You can re-write a metric in many different coordinates, and the singularity in the center of a black hole might look very different in those coordinates. In general relativity, you need to stick to things you can say independent of coordinates.
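For concreteness, here’s the metric in question, in those usual (Schwarzschild) coordinates and units where G = c = 1:

$$ ds^2 = -\left(1 - \frac{2M}{r}\right)dt^2 + \left(1 - \frac{2M}{r}\right)^{-1}dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right) $$

Written this way, r = 0 looks like a single point, just as it would in polar coordinates. But that appearance is a property of the coordinates, not of anything coordinate-independent.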

Ok, you might say, so the usual mathematics can’t answer the question. Can we use more unusual mathematics? If our definition of dimensions doesn’t tell us whether the singularity is a point, maybe we just need a new definition!

According to user4552, people have tried this…and it only sort of works. There are several different ways you could define the dimension of a singularity. They all seem reasonable in one way or another. But they give different answers! Some say they’re points, some say they’re three-dimensional. And crucially, there’s no obvious reason why one definition is “right”. The question we started with, “is the center really an infinitely small point?”, looked like a perfectly reasonable question, but it actually wasn’t: the question wasn’t precise enough.

This is the real problem. The problem isn’t that our question was undefined; after all, we can always add new definitions. The problem was that our question didn’t specify well enough the definitions we needed. That is why the question doesn’t have an answer.

Once you understand the difference, you see these kinds of questions everywhere. If you’re baffled by how mass could have come out of the Big Bang, or how black holes could radiate particles in Hawking radiation, maybe you’ve heard a physicist say that energy isn’t always conserved. Energy conservation is a consequence of symmetry, specifically, symmetry in time. If your space-time itself isn’t symmetric (the expanding universe making the past different from the future, a collapsing star making a black hole), then you shouldn’t expect energy to be conserved.
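A minimal version of that statement, from classical mechanics (a sketch, not the full general-relativistic story): define the energy from a Lagrangian L(q, q̇, t), and its time derivative along a solution picks up a term only when L depends explicitly on time:

$$ E = \sum_i \dot{q}_i \frac{\partial L}{\partial \dot{q}_i} - L, \qquad \frac{dE}{dt} = -\frac{\partial L}{\partial t} $$

If the laws don’t change with time, ∂L/∂t = 0 and energy is conserved; if the background itself evolves, the right-hand side need not vanish.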

I sometimes hear people object to this. They ask, is it really true that energy isn’t conserved when space-time isn’t symmetric? Shouldn’t we just say that space-time itself contains energy?

And, well, yes, you can say that, if you want. It isn’t part of the usual definition, but you can make a new definition, one that gives energy to space-time. In fact, you can make more than one new definition…and as with the singularity, these definitions don’t always agree! Once again, you asked a question you thought was sensible, but it wasn’t precise enough to have a definite answer.

Keep your eye out for these kinds of questions. If scientists seem to avoid answering the question you want, and keep answering a different question instead…it might be that their question is the only one with a precise answer. You can define a method to answer your question, sure…but it won’t be the only way. You need to ask precise enough questions to get good answers.

Geometry and Geometry

Last week, I gave the opening lectures for a course on scattering amplitudes, the things we compute to find probabilities in particle physics. After the first class, one of the students asked me if two different descriptions of these amplitudes, one called CHY and the other called the amplituhedron, were related. There does happen to be a connection, but it’s a bit subtle and indirect, not the sort of thing the student would have been thinking of. Why then, did he think they might be related? Well, he explained, both descriptions are geometric.

If you’ve been following this blog for a while, you’ve seen me talk about misunderstandings. There are a lot of subtle ways a smart student can misunderstand something, ways that can be hard for a teacher to recognize. The right question, or the right explanation, can reveal what’s going on. Here, I think the problem was that there are multiple meanings of geometry.

One of the descriptions the student asked about, CHY, is related to string theory. It describes scattering particles in terms of the path of a length of string through space and time. That path draws out a surface called a world-sheet, showing all the places the string touches on its journey. And that picture, of a wiggly surface drawn in space and time, looks like what most people think of as geometry: a “shape” in a pretty normal sense, which here describes the physics of scattering particles.

The other description, the amplituhedron, also uses geometric objects to describe scattering particles. But the “geometric objects” here are much more abstract. A few of them are familiar: straight lines, the area between them forming shapes on a plane. Most of them, though, are generalizations of this: instead of lines on a plane, they have higher-dimensional planes in higher-dimensional spaces. These too get described as geometry, even though they aren’t the “everyday” geometry you might be familiar with. Instead, they’re a “natural generalization”, something that, once you know the math, is close enough to that “everyday” geometry that it deserves the same name.

This week, two papers presented a totally different kind of geometric description of particle physics. In those papers, “geometric” has to do with differential geometry, the mathematics behind Einstein’s theory of general relativity. The descriptions are geometric because they use the same kind of building-blocks as that theory, a metric that bends space and time. Once again, this kind of geometry is a natural generalization of the everyday notion, but in yet another way.

All of these notions of geometry do have some things in common, of course. Maybe you could even write down a definition of “geometry” that includes all of them. But they’re different enough that if I tell you that two descriptions are “geometric”, it doesn’t tell you all that much. It definitely doesn’t tell you the two descriptions are related.

It’s a reasonable misunderstanding, though. It comes from a place where, used to “everyday” geometry, you expect two “geometric descriptions” of something to be similar: shapes moving in everyday space, things you can directly compare. Instead, a geometric description can be many sorts of shape, in many sorts of spaces, emphasizing many sorts of properties. “Geometry” is just a really broad term.

The Unpublishable Dirty Tricks of Theoretical Physics

As the saying goes, it is better not to see laws or sausages being made. You’d rather see the clean package on the outside than the mess behind the scenes.

The same is true of science. A good paper tells a nice, clean story: a logical argument from beginning to end, with no extra baggage to slow it down. That story isn’t a lie: for any decent paper in theoretical physics, the conclusions will follow from the premises. Most of the time, though, it isn’t how the physicist actually did it.

The way we actually make discoveries is messy. It involves looking for inspiration in all the wrong places: pieces of old computer code and old problems, trying to reproduce this or that calculation with this or that method. In the end, once we find something interesting enough, we can reconstruct a clearer, cleaner, story, something actually fit to publish. We hide the original mess partly for career reasons (easier to get hired if you tell a clean, heroic story), partly to be understood (a paper that embraced the mess of discovery would be a mess to read), and partly just due to that deep human instinct to not let others see us that way.

The trouble is, some of that “mess” is useful, even essential. And because it’s never published or put into textbooks, the only way to learn it is word of mouth.

A lot of these messy tricks involve numerics. Many theoretical physics papers derive things analytically, writing out equations in symbols. It’s easy to make a mistake in that kind of calculation, either writing something wrong on paper or as a bug in computer code. To correct mistakes, many things are checked numerically: we plug in numbers to make sure everything still works. Sometimes this means using an approximation, trying to make sure two things cancel to some large enough number of decimal places. Sometimes instead it’s exact: we plug in prime numbers, and can much more easily see if two things are equal, or if something is rational or contains a square root. Sometimes numerics aren’t just used to check something, but to find a solution: exploring many options in an easier numerical calculation, finding one that works, and doing it again analytically.
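To make the prime-number trick concrete, here’s a toy version (my own illustration, far simpler than anything in a real amplitudes code): check a suspected algebraic identity exactly, with rational arithmetic, so any mismatch is a clean inequality rather than a rounding question.

```python
from fractions import Fraction

# Suspected identity: 1/(x*(x+1)) == 1/x - 1/(x+1)
def lhs(x):
    return Fraction(1, x * (x + 1))

def rhs(x):
    return Fraction(1, x) - Fraction(1, x + 1)

# Plugging in primes keeps everything exact, and makes accidental
# collisions between unrelated expressions very unlikely.
for p in (2, 3, 5, 7, 11, 101):
    assert lhs(p) == rhs(p), f"identity fails at x = {p}"
print("identity holds at all sampled primes")
```

The same idea scales up: evaluate two multi-page expressions at a few prime values, and if they agree every time, you can be fairly confident they’re equal before investing in an analytic proof.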

“Ansätze” are also common: our fancy word for educated guesses. These we sometimes admit, when they’re at the core of a new scientific idea. But the more minor examples go unmentioned. If a paper shows a nice clean formula and proves it’s correct, but doesn’t explain how the authors got it…probably, they used an ansatz. This trick can go hand-in-hand with numerics as well: make a guess, check it matches the right numbers, then try to see why it’s true.
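As a toy example of that guess-then-check loop (entirely my own, not from any particular paper): guess that the sum 1 + 2 + … + n is quadratic in n, fix the unknown coefficients from a few data points, then test the result on values it wasn’t fitted to.

```python
import numpy as np

# Ansatz: 1 + 2 + ... + n = a*n**2 + b*n + c
ns = np.array([1, 2, 3])
sums = np.array([sum(range(1, n + 1)) for n in ns])

# Three unknowns, three data points: solve the linear system.
A = np.vander(ns, 3)  # columns are n**2, n, 1
a, b, c = np.linalg.solve(A, sums)
print(a, b, c)  # 0.5, 0.5, 0.0  ->  the familiar n*(n+1)/2

# Check the guess against fresh values before trying to prove it.
for n in (10, 100, 1000):
    assert np.isclose(a * n**2 + b * n + c, sum(range(1, n + 1)))
```

In real calculations the ansatz might have thousands of unknown rational coefficients, fixed by exactly this sort of linear solve, often over the prime-number evaluations mentioned above.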

The messy tricks can also involve the code itself. In my field we often use “computer algebra” systems, programs to do our calculations for us. These systems are programming languages in their own right, and we need to write computer code for them. That code gets passed around informally, but almost never standardized. Mathematical concepts that come up again and again can be implemented very differently by different people, some much more efficiently than others.

I don’t think it’s unreasonable that we leave “the mess” out of our papers. They would certainly be hard to understand otherwise! But it’s a shame we don’t publish our dirty tricks somewhere, even in special “dirty tricks” papers. Students often start out assuming everything is done the clean way, and start doubting themselves when they notice it’s much too slow to make progress. Learning the tricks is a big part of learning to be a physicist. We should find a better way to teach them.

Calculations of the Past

Last week was a birthday conference for one of the pioneers of my sub-field, Ettore Remiddi. I wasn’t there, but someone who was pointed me to some of the slides, including a talk by Stefano Laporta. For those of you who didn’t see my post a few weeks back, Laporta was one of Remiddi’s students, who developed one of the most important methods in our field and then vanished, spending ten years on an amazingly detailed calculation. Laporta’s talk covers more of the story, about what it was like to do precision calculations in that era.

“That era”, the 90’s through 2000’s, witnessed an enormous speedup in computers, and a corresponding speedup in what was possible. Laporta worked with Remiddi on the three-loop electron anomalous magnetic moment, something Remiddi had been working on since 1969. When Laporta joined in 1989, twenty-one of the seventy-two diagrams needed had still not been computed. They would polish them off over the next seven years, before Laporta dove into four loops. Twenty years later, he had that four-loop result to over a thousand digits.
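For context, “three loops” and “four loops” refer to orders in the perturbative expansion of the electron’s anomalous magnetic moment in powers of the fine-structure constant:

$$ a_e = \frac{g-2}{2} = C_1\left(\frac{\alpha}{\pi}\right) + C_2\left(\frac{\alpha}{\pi}\right)^2 + C_3\left(\frac{\alpha}{\pi}\right)^3 + C_4\left(\frac{\alpha}{\pi}\right)^4 + \cdots $$

Schwinger’s famous one-loop coefficient is C_1 = 1/2; the seventy-two diagrams mentioned above feed into C_3, and the thousand-digit result is C_4.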

One fascinating part of the talk is seeing how the computational techniques change over time, as language replaces language and computer clusters get involved. As a student, Laporta learns a lesson we all often need: that to avoid mistakes, we need to do as little by hand as possible, even for something as simple as copying a one-line formula. Looking at his review of others’ calculations, it’s remarkable how many theoretical results had to be dramatically corrected a few years down the line, and how much still might depend on theoretical precision.

Another theme was one of Remiddi suggesting something and Laporta doing something entirely different, and often much more productive. Whether it was using the arithmetic-geometric mean for an elliptic integral instead of Gaussian quadrature, or coming up with his namesake method, Laporta spent a lot of time going his own way, and Remiddi quickly learned to trust him.
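To see why the arithmetic-geometric mean is such a good trick, here’s a sketch (my own, certainly not Laporta’s actual code): the complete elliptic integral of the first kind drops straight out of the AGM iteration, which converges quadratically, where a quadrature rule would grind through many sample points for each new digit.

```python
import math

def agm(a, b, tol=1e-15):
    # Arithmetic-geometric mean: both sequences converge,
    # quadratically, to a common limit.
    while abs(a - b) > tol:
        a, b = (a + b) / 2, math.sqrt(a * b)
    return a

def ellipk(k):
    # Complete elliptic integral of the first kind, via
    # K(k) = pi / (2 * AGM(1, sqrt(1 - k**2)))
    return math.pi / (2 * agm(1.0, math.sqrt(1.0 - k * k)))

print(ellipk(0.5))  # ~1.6858
```

Each pass through the loop roughly doubles the number of correct digits, which is exactly what you want when you’re chasing a thousand of them.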

There’s a lot more in the slides that’s worth reading, including a mention of one of this year’s Physics Nobelists. The whole thing is an interesting look at what it takes to press precision to the utmost, and dedicate years to getting something right.