“Super” Computers: Using a Cluster

When I join a new department or institute, the first thing I ask is “do we have a cluster?”

Most of what I do, I do on a computer. Gone are the days when theorists would always do all their work on notepads and chalkboards (though many still do!). Instead, we use specialized computer programs like Mathematica and Maple. Using a program helps keep us from forgetting pesky minus signs, and it allows working with equations far too long to fit on a sheet of paper.

Now if computers help, more computers should help more. Since physicists like to add “super” to things, what about a supercomputer?

The Jaguars of the computing world.

Supercomputers are great, but they’re also expensive. The people who use supercomputers are the ones who model large, complicated systems, like the weather, or supernovae. For most theorists, you still want power, but you don’t need quite that much. That’s where computer clusters come in.

A computer cluster is pretty much what it sounds like: several computers wired together. Different clusters contain different numbers of computers. For example, my department has a ten-node cluster. Sure, that doesn’t stack up to a supercomputer, but it’s still ten times as fast as an ordinary computer, right?

The power of ten computers!

Well, not exactly. As several of my friends have been surprised to learn, the computers on our cluster are actually slower than most of our laptops.

The power of ten old computers!

Still, ten older computers are faster than one new one, yes?

Even then, it depends how you use it.

Run a normal task on a cluster, and it’s just going to run on one of the computers, which, as I’ve said, are slower than a modern laptop. You need to get smarter.

There are two big advantages of clusters: time, and parallelization.

Sometimes, you want to do a calculation that will take a long time. Your computer is going to be busy for a day or two, and that’s inconvenient when you want to do…well, pretty much anything else. A cluster is a space to run those long calculations. You put the calculation on one of the nodes, you go back to doing your work, and you check back in a day or two to see if it’s finished.

Clusters are at their most powerful when you can parallelize. If you need to do ten versions of the same calculation, each slightly different, then rather than doing them one at a time a cluster lets you do them all at once. At that point, it really is making you ten times faster.
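To make this concrete, here’s a minimal sketch of the pattern in Python. The calculation itself is a made-up stand-in, and on a real cluster you’d usually hand the pieces to the cluster’s job scheduler rather than a single Pool, but the idea is the same:

from multiprocessing import Pool

def long_calculation(parameter):
    # Stand-in for a slow computation that depends on one parameter.
    return sum(parameter ** n / (n + 1) for n in range(100000))

if __name__ == "__main__":
    # Ten slightly different versions of the same calculation...
    parameters = [0.1 * k for k in range(1, 11)]
    # ...run simultaneously, one per worker.
    with Pool(processes=10) as pool:
        results = pool.map(long_calculation, parameters)
    for p, r in zip(parameters, results):
        print(f"parameter={p:.1f} -> result={r:.6f}")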

If you ever program, I’d encourage you to look into the resources you have available. A cluster is a very handy thing to have access to, no matter what you’re doing!

A Wild Infinity Appears! Or, Renormalization

Back when Numberphile’s silly video about the zeta function came up, I wrote a post explaining the process of regularization, where physicists take an incorrect infinite result and patch it over to get something finite. At the end of that post I mentioned a particular variant of regularization, called renormalization, which was especially important in quantum field theory.

Renormalization has to do with how we do calculations and make predictions in particle physics. If you haven’t read my post “What’s so hard about Quantum Field Theory anyway?” you should read it before trying to tackle this one. The important concepts there are that probabilities in particle physics are calculated using Feynman Diagrams; that those diagrams consist of lines representing particles and points representing the ways they interact; that each line and point in the diagram gives a number that must be plugged in to the calculation; and that to do the full calculation you have to add up all the possible diagrams you can draw.

Let’s say you’re interested in finding out the mass of a particle. How about the Higgs?

You can’t weigh it, or otherwise see how gravity affects it: it’s much too light, and decays into other particles much too fast. Luckily, there is another way. As I mentioned in this post, a particle’s mass and its kinetic energy (energy of motion) both contribute to its total energy, which in turn affects what particles it can turn into if it decays. So if you want to find a particle’s mass, you need the relationship between its motion and its energy.
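Concretely, the relation is the standard one from special relativity,

E^2 = m^2 c^4 + p^2 c^2

and since energy and momentum are conserved, adding up the energies E_i and momenta \vec{p}_i of the decay products lets you solve for the mass of the particle that produced them:

m^2 c^4 = \Big(\sum_i E_i\Big)^2 - \Big|\sum_i \vec{p}_i\Big|^2 c^2

(This is the textbook “invariant mass” formula, nothing specific to the Higgs.)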

Suppose we’ve got a Higgs particle moving along. We know it was created out of some collision, and we know what it decays into at the end. With that, we can figure out its mass.

[Diagram “higgstree”: a Higgs created in a collision, traveling along, and decaying]

There’s a problem here, though: we only know what happens at the beginning and the end of this diagram. We can’t be certain what happens in the middle. That means we need to add in all of the other diagrams, every possible diagram with that beginning and that end.

Just to look at one example, suppose the Higgs particle splits into a quark and an anti-quark (the antimatter version of the quark). If they come back together later into a Higgs, the process would look the same from the outside. Here’s the diagram for it:

[Diagram “higgsloop”: the same process, with the Higgs splitting into a quark and anti-quark that recombine in the middle]

When we’re “measuring the Higgs mass”, what we’re actually measuring is the sum of every single diagram that begins with the creation of a Higgs and ends with it decaying.

Surprisingly, that’s not the problem!

The problem comes when you try to calculate the number that comes out of that diagram, when the Higgs splits into a quark-antiquark pair. According to the rules of quantum field theory, those quarks don’t have to obey the normal relationship between total energy, kinetic energy, and mass. They can have any kinetic energy at all, from zero all the way up to infinity. And because it’s quantum field theory, you have to add up all of those possible kinetic energies, all the way up. In this case, the diagram actually gives you infinity.

(Note that not every diagram with unlimited kinetic energy is going to be infinite. The first time theorists calculated infinite diagrams, they were surprised.

For those of you who know calculus, the problem here comes after you integrate over momentum. The two quarks each give a factor of one over the momentum, and then you integrate the result four times (for three dimensions of space plus time), which gives an infinite result. If you had different particles arranged in a different way you might divide by more factors of momentum and get a finite value.)
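Schematically, ignoring all the numerical factors and the quark masses, the dangerous part of the calculation behaves like

\int^{\Lambda} \frac{1}{p}\cdot\frac{1}{p}\, d^4p \sim \int^{\Lambda} \frac{p^3\, dp}{p^2} \sim \Lambda^2

where \Lambda is the largest kinetic energy included in the sum. Let \Lambda go to infinity, and the answer is infinite.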

The modern understanding of infinite results like this is that they arise from our ignorance. The mass of the Higgs isn’t actually infinity, because we can’t just add up every kinetic energy up to infinity. Instead, at some point before we get to infinity “something else” happens.

We don’t know what that “something else” is. It might be supersymmetry, it might be something else altogether. Whatever it is, we don’t know enough about it now to include it in the calculations as anything more than a cutoff, a point beyond which “something” happens. A theory with a cutoff like this, one that is only “effective” below a certain energy, is called an Effective Field Theory.

While we don’t know what happens at higher energies, we still need a way to complete our calculations if we want to use them in the real world. That’s where renormalization comes in.

When we use renormalization, we bring in experimental observations. We know that, no matter what is contributing to the Higgs particle’s mass, what we observe in the real world is finite. “Something” must be canceling the divergence, so we simply assume that “something” does, and that the final result agrees with the experiment!

"Something"

“Something”

In order to do this, we accepted the experimental result for the mass of the Higgs. That means that we’ve lost any ability to predict the mass from our theory. This is a general rule for renormalization: we trade ignorance (of the “something” that happens at high energy) for a loss of predictability.
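As a cartoon of how this works (a schematic, not the actual Standard Model formulas), think of the measured mass as the theory’s “bare” mass plus the cutoff-dependent contributions from diagrams like the one above:

m_{\textrm{measured}} = m_{\textrm{bare}} + \Delta m(\Lambda)

Renormalization amounts to choosing m_{\textrm{bare}} so that m_{\textrm{measured}} comes out equal to the experimental value no matter what \Lambda turns out to be. The price is exactly the one just described: the measured mass is now an input, not a prediction.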

If we had to do this for every calculation, we couldn’t predict anything at all. Luckily, for many theories (called renormalizable theories) there are theorems proving that you only need to do this a few times to fix the entire theory. You give up the ability to predict the results of a few experiments, but you gain the ability to predict the rest.

Luckily for us, the Standard Model is a renormalizable theory. Unfortunately, some important theories are not. In particular, quantum gravity is non-renormalizable. In order to fix the infinities in quantum gravity, you need to do the renormalization trick an infinite number of times, losing an infinite amount of predictability. Thus, while making a theory of quantum gravity is not difficult in principle, in practice the most obvious way to create the theory results in a “theory” that can never make any predictions.

One of the biggest virtues of string theory (some would say its greatest virtue) is that these infinities never appear. You never need to renormalize string theory in this way, which is what lets it work as a theory of quantum gravity. N=8 supergravity, the gravity cousin of N=4 super Yang-Mills, might also have this handy property, which is why many people are so eager to study it.

Why we Physics

There are a lot of good reasons to study theories in theoretical physics, even the ones that aren’t true. They teach us how to do calculations in other theories, including those that do describe reality, which lets us find out fundamental facts about nature. They let us hone our techniques, developing novel methods that often find use later, in some cases even as spinoff technology. (Mathematica came out of the theoretical physics community, while experimental high energy physics gave birth to the World Wide Web.)

Of course, none of this is why physicists actually do physics. Sure, Nima Arkani-Hamed might need to tell himself that space-time is doomed in order to get up in the morning, but for a lot of us, it isn’t about proving any wide-ranging point about the universe. It’s not even all about the awesome, as some would have it: most of what we do on a day-to-day basis isn’t especially awesome. It goes a bit deeper than that.

Science, in the end, is about solving puzzles. And solving puzzles is immensely satisfying, on a deep, fundamental level.

There’s a unique feeling that you get when all the pieces come together, when you’re calculating something and everything cancels and you’re left with a simple answer, and for some people that’s the best thing in existence.

It’s especially true when you’re working with an ansatz or using some other method where you fix parameters and fill in uncertainties, one by one. You can see how close you are to the answer, which means each step gives you that little thrill of getting just that much closer. One of my colleagues describes the calculations he does in supergravity as not tedious but “delightful” for precisely this reason: a calculation where every step puts another piece in the right place just feels good.

Theoretical physicists are the kind of people who would get a Lego set for their birthday, build it up to completion, and then never play with it again (unless it was to take it apart and make something else). We do it for the pure joy of seeing something come together and become complete. Save what it’s “for” for the grant committees, we’ve got a different rush in mind.

The Royal We of Theoretical Physics

I’m about to show you an abstract from a theoretical physics paper. Don’t worry about what it says, just observe the grammar.

[Image “wittenabstract”: the abstract of the paper]

Notice anything? Here, I’ll zoom in:

[Image “wittenwe”: the abstract, zoomed in on “we”]

This paper has one author, Edward Witten. So who’s “we”?

As it turns out, it is actually quite common in theoretical physics for a paper to use the word “we”, even when it is written by a single author. While this tradition has been called stilted, pompous, and just plain bad writing, there is a legitimate reason behind it. “We” is convenient, because it represents several different important things.

While the paper I quoted was written by only one author, many papers are collaborative efforts. Collaborations divide up their work in different ways, and it is often hard to pin down exactly who did what. Using “we” smooths over those differences.

What about single-authored papers, though? For a single author, and often even for multiple authors, “we” means the author plus the reader.

In principle, anyone reading a paper in theoretical physics should be able to follow along, doing the calculations on their own, and replicate the paper’s results. In practice this can often be difficult to impossible, but it’s still true that if you want to really retain what you read in theoretical physics, you need to follow along and do some of the calculation yourself. As a nod to this, it is conventional to write theoretical physics papers as if the reader was directly participating, leading them through the results point by point like exercises in a textbook. “We” do one calculation, then “we” use the result to derive the next point, and so on.

There are other meanings that “we” can occasionally serve, such as referring to everyone in a particular field, or a group in a hypothetical example.

While each of these meanings of “we” could potentially use a different word, that tends to make a paper feel cluttered, with jarring transitions between different subjects. Using “we” for everything gives the paper a consistent voice and feel, though it does come at the cost of obscuring some of the specific details of who did what. Especially for collaborations, the “we the collaborators” and “we the author plus reader” meanings can overlap and blur together. This usually isn’t a problem, but as I’ve been finding out recently it does make things tricky when writing for people who aren’t theoretical physicists, such as universities with guidelines that require a thesis to clearly specify who in a collaboration did what.

On an unrelated note, two papers went up this week pushing the hexagon function story to new and impressive heights. I wasn’t directly involved in either; I’ve been attacking a somewhat different part of the problem, and you can look forward to something on that in a few months.

Caltech Amplitudes Workshop, and Valentine’s Poem 2014

This week’s post will be a short one. I’m at a small workshop for young amplitudes-folks at Caltech, so I’m somewhat busy.

(What we call a workshop is a small conference focused on fostering discussion and collaboration. While there are a few talks to give the workshop structure, most of the time is spent in more informal discussions between the participants.)

There have been a lot of great talks, and a lot of great opportunities to bond with fellow young amplitudeologists. Also, great workshop swag!

Yes, that is a Hot Wheels Mars Rover

Unrelatedly, to continue a tradition from last year, and since it’s Valentine’s Day, allow me to present a short physics-themed poem I wrote a long time ago, this one about the sometimes counter-intuitive laws of thermodynamics:

Thermodynamic Hypothesis

A cold object, like a hot one, must be insulated

Cut off from interaction

Immerse the subject in a bath of warmth

And I reach equilibrium

What’s in a Thesis?

As I’ve mentioned before, I’m graduating this spring, which means I need to write that most foreboding of documents, the thesis. As I work on it, I’ve been thinking about how the nature of the thesis varies from field to field.

If you don’t have much experience with academia, you probably think of a thesis as a single, overarching achievement that structures a grad student’s career. A student enters grad school, designs an experiment, performs it, collects data, analyzes the data, draws some conclusion, then writes a thesis about it and graduates.

In some fields, the thesis really does work that way. In biology for example, the process of planning an experiment, setting it up, and analyzing and writing up the data can be just the right size so that, a reasonable percentage of the time, it really can all be done over the course of a PhD.

Other fields tend more towards smaller, faster-paced projects. In theoretical physics, mathematics, and computer science, most projects don’t have the same sort of large experimental overhead that psychologists or biologists have to deal with. The projects I’ve worked on are large-scale for theoretical physics, and I’ll still likely have worked on three distinct things before I graduate. Others, with smaller projects, will often have covered more.

In this situation, a thesis isn’t one overarching idea. Rather, it’s a compilation of work from past projects, sewn together with a pretense of an overall theme. It’s a bit messy, but because it’s the way things are expected to be done in these fields, no-one minds particularly much.

The other end of the spectrum is potentially much harder to deal with. For those who work on especially big experiments, the payoff might take longer to arrive than any reasonable degree. Big machines like colliders and particle detectors can take well over a decade before they start producing data, while longitudinal studies that follow a population as they grow and age take a long time no matter how fast you work.

In cases like this, the challenge is to chop off a small enough part of the project to make it feel like a thesis. A thesis could be written about designing one component for the eventual machine, or analyzing one part of the vast sea of data it produces. Preliminary data from a longitudinal study could be analyzed, even when the final results are many years down the line.

People in these fields have to be flexible and creative when it comes to creating a thesis, but usually the thesis committee is reasonable. In the end, a thesis is what you need to graduate, whatever that actually is for you.

Editors, Please Stop Misquoting Hawking

If you’ve been following science news recently, you’ve probably heard the apparently alarming news that Stephen Hawking has turned his back on black holes, or that black holes can actually be escaped, or…how about I just show you some headlines:

[Images: headlines from Fox News, Nature, and Yahoo]

Now, Hawking didn’t actually say that black holes don’t exist, but while there are a few good pieces on the topic, in many cases the real message has gotten lost in the noise.

From Hawking’s paper:

[Image “ActualPaperHawking”: the relevant passage of Hawking’s paper]

What Hawking is proposing is that the “event horizon” around a black hole, rather than being an absolute permanent boundary from which nothing can escape, is a more temporary “apparent” horizon, the properties of which he goes on to describe in detail.

Why is he proposing this? It all has to do with the debate over black hole firewalls.

Starting with a paper by Polchinski and colleagues a year and a half ago, the black hole firewall paradox centers on contradictory predictions from general relativity and quantum mechanics. General relativity predicts that an astronaut falling past a black hole’s event horizon will notice nothing particularly odd about the surrounding space, but that once past the event horizon none of the “information” that specifies the astronaut’s properties can escape to the outside world. Quantum mechanics on the other hand predicts that information cannot be truly lost. The combination appears to suggest something radical, a “firewall” of high energy radiation around the event horizon carrying information from everything that fell into the black hole in the past, so powerful that it would burn our hypothetical astronaut to a crisp.

Since then, a wide variety of people have made one proposal or another, either attempting to avoid the seemingly preposterous firewall or to justify and further explain it. The debate is so popular because it touches on some of the fundamental principles of quantum mechanics.

Now, as I have pointed out before, I’m not a good person to ask about the fundamental principles of quantum mechanics. (Incidentally, I’d love it if some of the more quantum information or general relativity-focused bloggers would take a more substantial crack at this! Carroll, Preskill, anyone?) What I can talk about, though, is hype.

All of the headlines I listed take Hawking’s quote out of context, but not all of the articles do. The problem isn’t so much the journalists, as the editors.

One of an editor’s responsibilities is to take articles and give them titles that draw in readers. The editor wants a title that will get people excited, make them curious, and most importantly, get them to click. While a journalist won’t have any particular incentive to improve ad revenue, the same cannot be said for an editor. Thus, editors will often rephrase the title of an article in a way that makes the whole story seem more shocking.

Now that, in itself, isn’t a problem. I’ve used titles like that myself. The problem comes when the title isn’t just shocking, but misleading.

When I call astrophysics “impossible”, nobody is going to think I mean it literally. The title is petulant and ridiculous enough that no-one would take it at face value, but still odd enough to make people curious. By contrast, when you say that Hawking has “changed his mind” about black holes or said that “black holes do not exist”, there are people who will take that at face value as supporting their existing beliefs, as the Borowitz Report humorously points out. These people will go off thinking that Hawking really has given up on black holes. If the title confirms their beliefs enough, people might not even bother to read the article. Thus, by using an actively misleading title, you may actually be decreasing clicks!

It’s not that hard to write a title that’s both enough of a hook to draw people in and won’t mislead. Editors of the world, you’re well-trained writers, certainly much better than me. I’m sure you can manage it.

There really is some interesting news here, if people had bothered to look into it. The firewall debate has been going on for a year and a half, and while Hawking isn’t the universal genius the media occasionally depicts he’s still the world’s foremost expert on the quantum properties of black holes. Why did he take so long to weigh in? Is what he’s proposing even particularly new? I seem to remember people discussing eliminating the horizon in one way or another (even “naked” singularities) much earlier in the firewall debate…what makes Hawking’s proposal novel and different?

This is the sort of thing you can use to draw in interest, editors of the world. Don’t just write titles that cause ignorant people to roll their eyes and move on, instead, get people curious about what’s really going on in science! More ad revenue for you, more science awareness for us, sounds like a win-win!

How (Not) to Sum the Natural Numbers: Zeta Function Regularization

1+2+3+4+5+6+\ldots=-\frac{1}{12}

If you follow Numberphile on YouTube or Bad Astronomy on Slate you’ve already seen this counter-intuitive sum written out. Similarly, if you follow those people or Scientopia’s Good Math, Bad Math, you’re aware that the way that sum was presented by Numberphile in that video was seriously flawed.

There is a real sense in which adding up all of the natural numbers (the numbers 1, 2, 3…) really does give you minus one twelfth, despite all the reasons this should be impossible. However, there is also a real sense in which it does not, and cannot, do any such thing. To explain this, I’m going to introduce two concepts: complex analysis and regularization.

This discussion is not going to be mathematically rigorous, but it should give an authentic and accurate view of where these results come from. If you’re interested in the full mathematical details, a later discussion by Numberphile should help, and the mathematically confident should read Terence Tao’s treatment from back in 2010.

With that said, let’s talk about sums! Well, one sum in particular:

\frac{1}{1^s}+\frac{1}{2^s}+\frac{1}{3^s}+\frac{1}{4^s}+\frac{1}{5^s}+\frac{1}{6^s}+\ldots = \zeta(s)

If s is greater than one, then each term in this infinite sum gets smaller and smaller fast enough that you can add them all up and get a number. That number is referred to as \zeta(s), the Riemann Zeta Function.
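You can watch that convergence happen numerically. Here’s a quick sketch in Python for s = 2, where the answer is famously \pi^2/6:

import math

# Add up 1/n^s term by term for s = 2.
s = 2
total = 0.0
for n in range(1, 100001):
    total += 1.0 / n ** s

print(total)             # about 1.6449241
print(math.pi ** 2 / 6)  # about 1.6449341, the true value of zeta(2)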

So what if s is smaller than one?

The infinite sum that I described doesn’t converge for s less than one. Add it up in any reasonable way, and it just approaches infinity. Put another way, the sum is not properly defined. But despite this, \zeta(s) is not infinite for s less than one!

Now as you might object, we only defined the Riemann Zeta Function for s greater than one. How do we know anything at all about it for s less than one?

That is where complex analysis comes in. Complex analysis sounds like a made-up term for something unreasonably complicated, but it’s quite a bit more approachable when you know what it means. Analysis is the type of mathematics that deals with functions, infinite series, and the foundations of calculus. It’s often contrasted with algebra, which usually considers mathematical concepts that are discrete rather than smooth (a huge simplification, but the distinction isn’t very important for this post). Complex means that complex analysis deals with functions, not of everyday real numbers, but of complex numbers: numbers with an imaginary part.

So what does complex analysis say about the Riemann Zeta Function?

One of the most impressive results of complex analysis is the discovery that if a function of a complex number is sufficiently smooth (the technical term is analytic) then it is very highly constrained. In particular, if you know how the function behaves over an area (technical term: open set), then you know how it behaves everywhere else!

If you’re expecting me to explain why this is true, you’ll be disappointed. This is serious mathematics, and serious mathematics isn’t the sort of thing you can give the derivation for in a few lines. It takes as much effort and knowledge to replicate a mathematical result as it does to replicate many lab results in science.

What I can tell you is that this sort of approach crops up in many places, and is part of a general theme. There is a lot you can tell about a mathematical function just by looking at its behavior in some limited area, because mathematics is often much more constrained than it appears. It’s the same sort of principle behind the work I’ve been doing recently.

In the case of the Riemann Zeta Function, we have a definition for s greater than one. As it turns out, this definition still works if s is a complex number, as long as the real part of s is greater than one. Using this information (the function’s value over a large area, half of the complex plane), complex analysis tells us its value at every other number. In particular, it tells us this:

\zeta(-1)= -\frac{1}{12}

If the Riemann Zeta Function is consistently defined for every complex number, then it must have this value when s is minus one.
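You don’t have to take my word for it, either: the mpmath Python library (assuming you have it installed) implements this analytic continuation, so you can check both values yourself:

from mpmath import zeta

print(zeta(2))   # 1.64493406684823, the convergent sum (pi^2/6)
print(zeta(-1))  # -0.0833333333333333, that is, -1/12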

If we still trusted the sum definition for this value of s, we could plug in -1 and get

 1+2+3+4+5+6+\ldots=-\frac{1}{12}

Does that make this statement true? Sort of. It all boils down to a concept from physics called regularization.

In physics, we know that in general there is no such thing as infinity. With a few exceptions, nothing in nature should be infinite, and finite evidence (without mathematical trickery) should never lead us to an infinite conclusion.

Despite this, occasionally calculations in physics will give infinite results. Almost always, this is evidence that we are doing something wrong: we are not thinking hard enough about what’s really going on, or there is something we don’t know or aren’t taking into account.

Doing physics research isn’t like taking a physics class: sometimes, nobody knows how to do the problem correctly! In many cases where we find infinities, we don’t know enough about “what’s really going on” to correct them. That’s where regularization comes in handy.

Regularization is the process by which an infinite result is replaced with a finite result (made “regular”), in a way so that it keeps the same properties. These finite results can then be used to do calculations and make predictions, and so long as the final predictions are regularization independent (that is, the same if you had done a different regularization trick instead) then they are legitimate.

In string theory, one way to compute the required dimensions of space and time ends up giving you an infinite sum, a sum that goes 1+2+3+4+5+…. In context, this result is obviously wrong, so we regularize it. In particular, we say that what we’re really calculating is the Riemann Zeta Function, which we happen to be evaluating at -1. Then we replace 1+2+3+4+5+… with -1/12.
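Schematically (this is light-cone quantization of the bosonic string, heavily compressed), the sum appears in the zero-point energy, tied to the number of dimensions D:

\frac{D-2}{2}\sum_{n=1}^{\infty} n \longrightarrow \frac{D-2}{2}\,\zeta(-1) = -\frac{D-2}{24}

Consistency of the theory then forces this constant to take one specific value, which only happens when D = 26.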

Now remember when I said that getting infinities is a sign that you’re doing something wrong? These days, we have a more rigorous way to do this same calculation in string theory, one that never forces us to take an infinite sum. As expected, it gives the same result as the old method, showing that the old calculation was indeed regularization independent.

Sometimes we don’t have a better way of doing the calculation, and that’s when regularization techniques come in most handy. A particular family of tricks called renormalization is quite important, and I’ll almost certainly discuss it in a future post.

So can you really add up all the natural numbers and get -1/12? No. But if a calculation tells you to add up all the natural numbers, and it’s obvious that the result can’t be infinite, then it may secretly be asking you to calculate the Riemann Zeta Function at -1. And that, as we know from complex analysis, is indeed -1/12.

Astrophysics, the Impossible Science

Last week, Nobel Laureate Martinus Veltman gave a talk at the Simons Center. After the talk, a number of people asked him questions about several things he didn’t know much about, including supersymmetry and dark matter. After deflecting a few such questions, he proceeded to go on a brief rant against astrophysics, professing suspicion of a field that cannot do experiments and making fun of an astrophysicist colleague’s imprecise data. The rant was a rather memorable feat of curmudgeonliness, and apparently typical Veltman behavior. It left several of my astrophysicist friends fuming. For my part, it inspired me to write a positive piece on astrophysics, highlighting something I don’t think is brought up enough.

The thing about astrophysics, see, is that astrophysics is impossible.

Imagine, if you will, an astrophysical object. As an example, picture a black hole swallowing a star.

Are you picturing it?

Now think about where you’re looking from. Chances are, you’re at some point up above the black hole, watching the star swirl around, seeing something like this:

Where are you in this situation? On a spaceship? Looking through a camera on some probe?

Astrophysicists don’t have spaceships that can go visit black holes. Even the longest-ranging probes have barely left the solar system. If an astrophysicist wants to study a black hole swallowing a star, they can’t just look at a view like that. Instead, they look at something like this:

The image on the right is an artist’s idea of what a black hole looks like. The three on the left? They’re what the astrophysicist actually sees. And even that is cleaned up a bit, the raw output can be even more opaque.

A black hole swallowing a star? Just a few blobs of light, pixels on screen. You can measure brightness and dimness, filter by color from gamma rays to radio waves, and watch how things change with time. You don’t even get a whole lot of pixels for distant objects. You can’t do experiments, either, you just have to wait for something interesting to happen and try to learn from the results.

It’s like staring at the static on a TV screen, day after day, looking for patterns, until you map out worlds and chart out new laws of physics and infer a space orders of magnitude larger than anything anyone’s ever experienced.

And naively, that’s just completely and utterly impossible.

And yet…and yet…and yet…it works!

Crazy people staring at a screen can’t successfully make predictions about what another part of the screen will look like. They can’t compare results and hone their findings. They can’t demonstrate principles (like General Relativity) that change technology here on Earth. Astrophysics builds on itself, discovery by discovery, in a way that can only be explained by accepting that it really does work (a theme that I’ve had occasion to harp on before).

Physics began with astrophysics. Trying to explain the motion of dots in a telescope and objects on the ground with the same rules led to everything we now know about the world. Astrophysics is hard, arguably impossible…but impossible or not, there are people who spend their lives successfully making it work.

What does Copernicus have to say about String Theory?

Putting aside some highly controversial exceptions, string theory has made no testable predictions. Conceivably, a world governed by string theory and a world governed by conventional particle physics would be indistinguishable to every test we could perform today. Furthermore, it’s not even possible to say that string theory predicts the same things with fewer fudge-factors, as string theory descriptions of our world seem to have dramatically more free parameters than conventional ones.

Critics of string theory point to this as a reason why string theory should be excluded from science, sent off to the chilly arctic wasteland of the math department. (No offense to mathematicians, I’m sure your department is actually quite warm and toasty.) What these critics are missing is an important feature of the scientific process: before scientists are able to make predictions, they propose explanations.

To explain what I mean by that, let’s go back to the beginning of the 16th century.

At the time, the authority on astronomy was still Ptolemy’s Syntaxis Mathematica, a book so renowned that it is better known by the Arabic-derived superlative Almagest, “the greatest”. Ptolemy modeled the motions of the planets and stars as a series of interlocking crystal spheres with the Earth at the center, and did so well enough that until that time only minor improvements on the model had been made.

This is much trickier than it sounds, because even in Ptolemy’s day astronomers could tell that the planets did not move in simple circles around the Earth. There were major distortions from circular motion, the most dramatic being the phenomenon of retrograde motion.

If the planets really were moving in simple circles around the Earth, you would expect them to keep moving in the same direction. However, ancient astronomers saw that sometimes, some of the planets moved backwards. The planet would slow down, turn around, go backwards a bit, then come to a stop and turn again.

Thus sparking the invention of the spirograph.

In order to take this into account, Ptolemy introduced epicycles, extra circles of motion for the planets. The epicycle would move on the planet’s primary circle, or deferent, and the planet would rotate around the epicycle, like so:

French Wikipedia had a better picture.
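In modern notation the combined motion is easy to write down. Schematically, with R and \omega for the deferent’s radius and angular speed and r and \Omega for the epicycle’s,

x(t) = R\cos(\omega t) + r\cos(\Omega t), \qquad y(t) = R\sin(\omega t) + r\sin(\Omega t)

Each additional epicycle just adds another pair of terms.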

These epicycles weren’t just for retrograde motion, though. They allowed Ptolemy to model all sorts of irregularities in the planets’ motions. Any deviation from a circle could conceivably be plotted out by adding another epicycle (though Ptolemy had other methods to model this sort of thing, among them something called an equant).

Enter Copernicus.

Enter Copernicus’s hair.

Copernicus didn’t like Ptolemy’s model. He didn’t like equants, and what’s more, he didn’t like the idea that the Earth was the center of the universe. Like Plato, he preferred the idea that the center of the universe was a divine fire, a source of heat and light like the Sun. He decided to put together a model of the planets with the Sun in the center. And what he found, when he did, was an explanation for retrograde motion.

In Copernicus’s model, the planets always go in one direction around the Sun, never turning back. However, some of the planets are faster than the Earth, and some are slower. If a planet is slower than the Earth, then when the Earth passes it the planet will look, from Earth, like it is going backwards. This is tricky to visualize, but think of Mars: it starts out ahead of Earth in its orbit, then falls behind, making it appear to move backwards.
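If the picture doesn’t do it for you, you can also watch retrograde motion fall out of a toy calculation. Here’s a sketch in Python, pretending both orbits are perfect circles (they aren’t, but it’s enough to produce the effect):

import numpy as np

# Toy heliocentric model: circular, coplanar orbits.
# Radii in AU and periods in years, roughly the real values.
t = np.linspace(0.0, 4.0, 4000)               # four years of motion
earth = 1.00 * np.exp(2j * np.pi * t / 1.00)  # Earth's position around the Sun
mars = 1.52 * np.exp(2j * np.pi * t / 1.88)   # Mars's position around the Sun

# The direction from Earth to Mars, as an angle on the sky.
longitude = np.unwrap(np.angle(mars - earth))

# Wherever that angle decreases, Mars appears to move backwards.
retrograde = np.diff(longitude) < 0
print(f"Mars appears retrograde {100 * retrograde.mean():.1f}% of the time")

Neither planet ever turns around in this model; the backwards motion is purely an artifact of watching one moving platform from another.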

Despite this simplification, Copernicus still needed epicycles. The planets’ motions simply aren’t perfect circles, even around the Sun. After getting rid of the equants from Ptolemy’s theory, Copernicus’s model ended up having just as many epicycles as Ptolemy’s!

Copernicus’s model wasn’t any better at making predictions (in fact, due to some technical lapses in its presentation, it was even a little bit worse). It didn’t have fewer “fudge factors”, as it had about the same number of epicycles. If you lived in the 16th century, you would have been completely justified in believing that the Earth was the center of the universe, and not the Sun. Copernicus had failed to establish his model as scientific truth.

However, Copernicus had still done something Ptolemy didn’t: he had explained retrograde motion. Retrograde motion was a unique, qualitative phenomenon, and while Ptolemy could include it in his math, only Copernicus gave you a reason why it happened.

That’s not enough to become the reigning scientific truth, but it’s a damn good reason to pay attention. It was justification for astronomers to dedicate years of their lives to improving the model, to working with it and trying to get unique predictions out of it. It was enough that, over half a century later, Kepler could take it and turn it into a theory that did make predictions better than Ptolemy, that did have fewer fudge-factors.

String theory as a model of the universe doesn’t make novel predictions, and it doesn’t have fewer fudge factors. What it does is explain: the spectra of particles in terms of shapes of space and time, the existence of gravity and light in terms of closed and open strings, the temperature of black holes in terms of what’s going on inside them (this last really ought to be the subject of its own post; it’s one of the big triumphs of string theory). You don’t need to accept it as scientific truth. Like Copernicus’s model in his day, we don’t have the evidence for that yet. But you should understand that, as a powerful explanation, the idea of string theory as a model of the universe is worth spending time on.

Of course, string theory is useful for many things that aren’t modeling the universe. But that’s the subject of another post.