Gravity is Yang-Mills Squared

There’s a concept that I’ve wanted to present for quite some time. It’s one of the coolest accomplishments in my subfield, but I thought that explaining it would involve too much technical detail. However, the recent BICEP2 results have brought one aspect of it to the public eye, so I’ve decided that people are ready.

If you’ve been following the recent announcements by the BICEP2 telescope of their indirect observation of primordial gravitational waves, you’ve probably seen the phrases “E-mode polarization” and “B-mode polarization” thrown around. You may even have seen pictures, showing that light in the cosmic microwave background is polarized differently by quantum fluctuations in the inflaton field and by quantum fluctuations in gravity.

But why is there a difference? What’s unique about gravitational waves that makes them different from the other waves in nature?

As it turns out, the difference all boils down to one statement:

Gravity is Yang-Mills squared.

This is both a very simple claim and a very subtle one, and it comes up in many, many places in physics.

Yang-Mills, for those who haven’t read my older posts, is a general category that contains most of the fundamental forces. Electromagnetism, the strong nuclear force, and the weak nuclear force are all variants of Yang-Mills forces.

Yang-Mills forces have “spin 1”. Another way to say this is that Yang-Mills forces are vector forces. If you remember vectors from math class, you might remember that a vector has a direction and a strength. This hopefully makes sense: forces point in a direction, and have a strength. You may also remember that vectors can be described in terms of components. A vector in four space-time dimensions has four components: x, y, z, and time, like so:

\left( \begin{array}{c} x \\ y \\ z \\ t \end{array} \right)

Gravity has “spin 2”.

As I’ve talked about before, gravity bends space and time, which means that it modifies the way you calculate distances. In practice, that means it needs to be something that can couple two vectors together: a matrix, or more precisely, a tensor, like so:

\left( \begin{array}{cccc} xx & xy & xz & xt\\ yx & yy & yz & yt\\ zx & zy & zz & zt\\ tx & ty & tz & tt\end{array} \right)

So while a Yang-Mills force has four components, gravity has sixteen. Gravity is Yang-Mills squared.

(Technical note: gravity actually doesn’t use all sixteen components, because its tensor is symmetric and traceless. However, when studying gravity’s quantum properties, theorists often add on extra fields to “complete the square” and fill in the remaining components.)

There’s much more to the connection than that, though. For one, it appears in the kinds of waves the two types of forces can create.

In order to create an electromagnetic wave you need a dipole, a negative charge and a positive charge at opposite ends of a line, and you need that dipole to change over time.

[Animation: a dipole changing over time. Change over time, of course, is a property of gifs.]

Gravity doesn’t have negative and positive charges, it just has one type of charge. Thus, to create gravitational waves you need not a dipole, but a quadrupole: instead of a line between two opposite charges, you have four gravitational charges (masses) arranged in a square. This creates a “breathing” sort of motion, instead of the back-and-forth motion of electromagnetic waves.

[Animation: a quadrupole’s “breathing” motion. This is your brain on gravitational waves.]
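
For the mathematically inclined, this difference shows up in the classic radiation formulas (standard textbook results, included here for flavor): electromagnetic radiation is driven by the second time derivative of the dipole moment, while gravitational radiation is driven by the third time derivative of the traceless mass quadrupole moment Q_{ij}:

P_{\rm grav} = \frac{G}{5c^5} \left\langle \dddot{Q}_{ij} \, \dddot{Q}^{ij} \right\rangle

No matter how massive the system, if its quadrupole moment isn’t changing fast enough for that third derivative to be nonzero, it doesn’t radiate gravitationally.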

This is why gravitational waves have a different shape than electromagnetic waves, and why they have a unique effect on the cosmic microwave background, allowing them to be spotted by BICEP2. Gravity, once again, is Yang-Mills squared.

But wait, there’s more!

So far, I’ve shown you that gravity is the square of Yang-Mills, but not in a very literal way. Yes, there are lots of similarities, but it’s not like you can just square a calculation in Yang-Mills and get a calculation in gravity, right?

Well actually…

In quantum field theory, calculations are traditionally done using tools called Feynman diagrams, organized by how many loops the diagram contains. The simplest diagrams have no loops, and are called tree diagrams.

Fascinatingly, for tree diagrams the message of this post is as literal as it can be. Using something called the Kawai-Lewellen-Tye relations, the result of a tree diagram calculation in gravity can be found just by taking a similar calculation in Yang-Mills and squaring it.
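
To give the flavor (in one common convention; overall factors vary between references), the simplest of these relations writes the four-graviton tree amplitude as a product of two four-gluon tree amplitudes with permuted legs:

M_4^{\rm tree}(1,2,3,4) = -i \, s_{12} \, A_4^{\rm tree}(1,2,3,4) \, \tilde{A}_4^{\rm tree}(1,2,4,3)

Here s_{12} is the invariant mass squared of particles 1 and 2: the “squaring” really is a literal product of two Yang-Mills amplitudes.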

(Interestingly enough, these relations were originally discovered using string theory, but they don’t require string theory to work. It’s yet another example of how string theory functions as a laboratory to make discoveries about quantum field theory.)

Does this hold beyond tree diagrams? As it turns out, the answer is again yes! The calculation involved is a little more complicated, but as discovered by Zvi Bern, John Joseph Carrasco, and Henrik Johansson, if you can get your calculation in Yang-Mills into the right format then all you need to do is square the right thing at the right step to get gravity, even for diagrams with loops!
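
Schematically, the recipe looks like this (a sketch of the structure, suppressing couplings): write the Yang-Mills amplitude as a sum over diagrams, each with a color factor c_i, a kinematic numerator n_i arranged to satisfy the same Jacobi identities as the color factors, and a denominator D_i built from the diagram’s propagators. “Squaring the right thing” then means swapping each color factor for a second copy of the numerator:

A_{\rm Yang\text{-}Mills} = \sum_i \frac{n_i \, c_i}{D_i} \qquad \longrightarrow \qquad M_{\rm gravity} = \sum_i \frac{n_i \, \tilde{n}_i}{D_i}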

[Photos: Zvi Bern and John Joseph Carrasco]

This trick, called BCJ duality after its discoverers, has allowed calculations in quantum gravity that far outpace what would be possible without it. In N=8 supergravity, the gravity analogue of N=4 super Yang-Mills, calculations have progressed up to four loops, and have revealed tantalizing hints that the uncontrolled infinities that usually plague gravity theories are absent in N=8 supergravity, even without adding in string theory. Results like these are why BCJ duality is viewed as one of the “foundational miracles” of the field for those of us who study scattering amplitudes.

Gravity is Yang-Mills squared, in more ways than one. And because gravity is Yang-Mills squared, gravity may just be tame-able after all.

Flexing the BICEP2 Results

The physicsverse has been abuzz this week with news of the BICEP2 experiment’s observations of B-mode polarization in the Cosmic Microwave Background.

There are lots of good sources on this, and it’s not really my field, so I’m just going to give a quick summary before talking about a few aspects I find interesting.

BICEP2 is a telescope in Antarctica that observes the Cosmic Microwave Background, light left over from the first time that the universe was clear enough for light to travel. (If you’re interested in a background on what we know about how the universe began, Of Particular Significance has an article here that should be fairly detailed, and I have a take on some more speculative aspects here.) Earlier experiments that observed the Cosmic Microwave Background discovered a surprising amount of uniformity. This led to the proposal of a concept called inflation: the idea that at some point the early universe expanded exponentially, smearing any non-uniformities across the sky and smoothing everything out. Since the rate at which the universe expands is a single number, if that number is to vary across space and time it is naturally described by a scalar field, in this case called the inflaton.

During inflation, distances themselves get stretched out. Think about inflation like enlarging an image. As you’ve probably noticed (maybe even in early posts on this blog), enlarging an image doesn’t always work out well. The resulting image is often pixelated or distorted. Some of the distortion comes from glitches in the program that enlarges the image, while some of it is just what happens when the pixels of the original image get enlarged to the point that you can see them.

[Image: enlarging the Cosmic Microwave Background]

Quantum fluctuations in the inflaton field itself are the glitches in the program, enlarging some areas more than others. The pattern they create in the Cosmic Microwave Background is called E-mode polarization, and several other experiments have been able to detect it.

Much weaker is the effect of the “pixels” of the original image. Since the original image is spacetime itself, the pixels are the quantum fluctuations of spacetime: quantum gravity waves. Inflation enlarged them to the point that they were visible on a large-distance scale, fundamental non-uniformity in the world blown up big enough to affect the distribution of light. The effect this had on light is detectably different: it’s called B-mode polarization, and this is the first experiment to detect it on the right scale for it to be caused by gravity waves.

Measuring this polarization, in particular how strong it is, tells us a lot about how inflation occurred. It’s enough to rule out several models, and lend support to several others. If the results are corroborated this will be real, useful evidence, the sort physicists love to get, and folks are happily crunching numbers on it all over the world.

All that said, this site is called four gravitons and a grad student, and I’m betting that some of you want to ask this grad student: is this evidence for gravitons, or for gravity waves?

Sort of.

We already had good indirect evidence for gravity waves: pairs of neutron stars release gravity waves as they orbit each other, which causes them to slow down. Since we’ve observed them slowing down at the right rates, we were already confident gravity waves exist. And if you’ve got gravity waves, gravitons follow as a natural consequence of quantum mechanics.

The data from BICEP2 is also indirect. The gravity waves “observed” by BICEP2 were present in the early universe; what is actually being observed is their effect on the light that would become the Cosmic Microwave Background, not the gravity waves themselves. We have yet to directly detect gravity waves with a gravity telescope like LIGO.

On the other hand, a “gravity telescope” isn’t exactly direct either. In order to detect gravity waves, LIGO and other gravity telescopes attempt to measure their effect on the distances between objects. How do they do that? By looking at interference patterns of light.

In both cases, we’re looking at light, present in the environment of a gravity wave, and examining its properties. Of course, in a gravity telescope the light is from a nearby environment under tight control, while the Cosmic Microwave Background is light from as far away and long ago as anything within the reach of science today. In both cases, though, it’s not nearly as simple as “observing” an effect. “Seeing” anything in high energy physics or astrophysics is always a matter of interpreting data based on science we already know.

Alright, that’s evidence for gravity waves. Does that mean evidence for gravitons?

I’ve seen a few people describe BICEP2’s results as evidence for quantum gravity/quantum gravity effects. I felt a little uncomfortable with that claim, so I asked Matt Strassler what he thought. I think his perspective on this is the right one. Quantum gravity is just what happens when gravity exists in a quantum world. As I’ve said on this site before, quantum gravity is easy. The hard part is making a theory of quantum gravity that has real predictive power, and that’s something these results don’t shed any light on at all.

That said, I’m a bit conflicted. They really are seeing a quantum effect in gravity, and as far as I’m aware this really is the first time such an effect has been observed. Gravity is so weak, and quantum gravity effects so small, that it takes inflation blowing them up across the sky for them to be visible. Now, I don’t think there was anyone out there who thought gravity didn’t have quantum fluctuations (or at least, anyone with a serious scientific case). But seeing into a new regime, even if it doesn’t tell us much…that’s important, isn’t it? (After writing this, I read Matt Strassler’s more recent post, where he has a paragraph professing similar sentiments).

On yet another hand, I’ve heard it asserted in another context that loop quantum gravity researchers don’t know how to get gravitons. I know nothing about the technical details of loop quantum gravity, so I don’t know if that actually has any relevance here…but it does amuse me.

“Super” Computers: Using a Cluster

When I join a new department or institute, the first thing I ask is “do we have a cluster?”

Most of what I do, I do on a computer. Gone are the days when theorists would always do all their work on notepads and chalkboards (though many still do!). Instead, we use specialized computer programs like Mathematica and Maple. Using a program helps keep us from forgetting pesky minus signs, and it allows working with equations far too long to fit on a sheet of paper.

Now, if computers help, more computers should help more. Since physicists like to add “super” to things, what about a supercomputer?

[Image: the Jaguars of the computing world]

Supercomputers are great, but they’re also expensive. The people who use supercomputers are the ones who model large, complicated systems, like the weather, or supernovae. For most theorists, you still want power, but you don’t need quite that much. That’s where computer clusters come in.

A computer cluster is pretty much what it sounds like: several computers wired together. Different clusters contain different numbers of computers. For example, my department has a ten-node cluster. Sure, that doesn’t stack up to a supercomputer, but it’s still ten times as fast as an ordinary computer, right?

The power of ten computers!

Well, not exactly. As several of my friends have been surprised to learn, the computers on our cluster are actually slower than most of our laptops.

The power of ten old computers!

Still, ten older computers are faster than one new one, yes?

Even then, it depends how you use it.

Run a normal task on a cluster, and it’s just going to run on one of the computers, which, as I’ve said, are slower than a modern laptop. You need to get smarter.

There are two big advantages of clusters: time, and parallelization.

Sometimes, you want to do a calculation that will take a long time. Your computer is going to be busy for a day or two, and that’s inconvenient when you want to do…well, pretty much anything else. A cluster is a space to run those long calculations. You put the calculation on one of the nodes, you go back to doing your work, and you check back in a day or two to see if it’s finished.

Clusters are at their most powerful when you can parallelize. If you need to do ten versions of the same calculation, each slightly different, then rather than doing them one at a time a cluster lets you do them all at once. At that point, it really is making you ten times faster.
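
As a toy illustration, here is a minimal Python sketch of the idea (the function and numbers are made up for the example; on a real cluster you would typically hand these jobs to a scheduler, but the parallel structure is the same):

from multiprocessing import Pool

def calculation(parameter):
    # Stand-in for one version of a long-running calculation.
    return parameter, sum(i * i for i in range(parameter * 100000))

if __name__ == "__main__":
    parameters = range(1, 11)          # ten slightly different versions
    with Pool(processes=10) as pool:   # run them all at once
        for parameter, result in pool.map(calculation, parameters):
            print("version", parameter, "gave", result)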

If you ever program, I’d encourage you to look into the resources you have available. A cluster is a very handy thing to have access to, no matter what you’re doing!

A Wild Infinity Appears! Or, Renormalization

Back when Numberphile’s silly video about the zeta function came up, I wrote a post explaining the process of regularization, where physicists take an incorrect infinite result and patch it over to get something finite. At the end of that post I mentioned a particular variant of regularization, called renormalization, which was especially important in quantum field theory.

Renormalization has to do with how we do calculations and make predictions in particle physics. If you haven’t read my post “What’s so hard about Quantum Field Theory anyway?” you should read it before trying to tackle this one. The important concepts there are that probabilities in particle physics are calculated using Feynman Diagrams, that those diagrams consist of lines representing particles and points representing the ways they interact, that each line and point in the diagram gives a number that must be plugged in to the calculation, and that to do the full calculation you have to add up all the possible diagrams you can draw.

Let’s say you’re interested in finding out the mass of a particle. How about the Higgs?

You can’t weigh it, or otherwise see how gravity affects it: it’s much too light, and decays into other particles much too fast. Luckily, there is another way. As I mentioned in this post, a particle’s mass and its kinetic energy (energy of motion) both contribute to its total energy, which in turn affects what particles it can turn into if it decays. So if you want to find a particle’s mass, you need the relationship between its motion and its energy.
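
Concretely, that relationship is special relativity’s energy-momentum relation: measure the total energy E and momentum p of the decay products, and the mass of the particle they came from follows.

E^2 = (pc)^2 + (mc^2)^2 \quad \Rightarrow \quad mc^2 = \sqrt{E^2 - (pc)^2}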

Suppose we’ve got a Higgs particle moving along. We know it was created out of some collision, and we know what it decays into at the end. With that, we can figure out its mass.

[Figure: tree diagram of a Higgs created in a collision and later decaying]

There’s a problem here, though: we only know what happens at the beginning and the end of this diagram. We can’t be certain what happens in the middle. That means we need to add in all of the other diagrams, every possible diagram with that beginning and that end.

Just to look at one example, suppose the Higgs particle splits into a quark and an anti-quark (the antimatter version of the quark). If they come back together later into a Higgs, the process would look the same from the outside. Here’s the diagram for it:

[Figure: diagram in which the Higgs splits into a quark and an antiquark that rejoin into a Higgs]

When we’re “measuring the Higgs mass”, what we’re actually measuring is the sum of every single diagram that begins with the creation of a Higgs and ends with it decaying.

Surprisingly, that’s not the problem!

The problem comes when you try to calculate the number that comes out of that diagram, when the Higgs splits into a quark-antiquark pair. According to the rules of quantum field theory, those quarks don’t have to obey the normal relationship between total energy, kinetic energy, and mass. They can have any kinetic energy at all, from zero all the way up to infinity. And because it’s quantum field theory, you have to add up the contributions from every one of those possible kinetic energies. In this case, the diagram actually gives you infinity.

(Note that not every diagram with unlimited kinetic energy is going to be infinite. The first time theorists calculated infinite diagrams, they were surprised.

For those of you who know calculus, the problem here comes when you integrate over the loop momentum. The two quarks each give a factor of one over the momentum, and you integrate over four dimensions of momentum (three of space plus one of time), which gives an infinite result. If you had different particles arranged in a different way you might divide by more factors of momentum and get a finite value.)
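
In symbols, the divergence looks schematically like this (a sketch, suppressing masses and numerator details): each quark propagator contributes roughly 1/k at large loop momentum k, while the four-dimensional measure contributes k^3 dk, so

\int^{\Lambda} \frac{1}{k} \cdot \frac{1}{k} \, k^3 \, dk \sim \int^{\Lambda} k \, dk \sim \Lambda^2

which blows up as the upper limit Λ (a cutoff of the kind discussed below) is taken to infinity.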

The modern understanding of infinite results like this is that they arise from our ignorance. The mass of the Higgs isn’t actually infinity, because we can’t just add up every kinetic energy up to infinity. Instead, at some point before we get to infinity “something else” happens.

We don’t know what that “something else” is. It might be supersymmetry, it might be something else altogether. Whatever it is, we don’t know enough about it now to include it in the calculations as anything more than a cutoff, a point beyond which “something” happens. A theory with a cutoff like this, one that is only “effective” below a certain energy, is called an Effective Field Theory.

While we don’t know what happens at higher energies, we still need a way to complete our calculations if we want to use them in the real world. That’s where renormalization comes in.

When we use renormalization, we bring in experimental observations. We know that, no matter what is contributing to the Higgs particle’s mass, what we observe in the real world is finite. “Something” must be canceling the divergence, so we simply assume that “something” does, and that the final result agrees with the experiment!

"Something"

“Something”

In order to do this, we accept the experimental result for the mass of the Higgs. That means we’ve lost any ability to predict the mass from our theory. This is a general rule for renormalization: we trade ignorance (of the “something” that happens at high energy) for a loss of predictability.
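
Schematically (a sketch of the bookkeeping, not the full machinery), the mass appearing in the original equations, the “bare” mass, is chosen to depend on the cutoff Λ in just the right way to soak up the divergence:

m_{\rm observed}^2 = m_{\rm bare}^2(\Lambda) + \Sigma(\Lambda)

The loop contribution Σ(Λ) diverges as Λ goes to infinity, so m_bare is tuned, cutoff by cutoff, to keep m_observed equal to the measured value.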

If we had to do this for every calculation, we couldn’t predict anything at all. Luckily, for many theories (called renormalizable theories) there are theorems proving that you only need to do this a few times to fix the entire theory. You give up the ability to predict the results of a few experiments, but you gain the ability to predict the rest.

Luckily for us, the Standard Model is a renormalizable theory. Unfortunately, some important theories are not. In particular, quantum gravity is non-renormalizable. In order to fix the infinities in quantum gravity, you need to do the renormalization trick an infinite number of times, losing an infinite amount of predictability. Thus, while making a theory of quantum gravity is not difficult in principle, in practice the most obvious way to create the theory results in a “theory” that can never make any predictions.

One of the biggest virtues of string theory (some would say its greatest virtue) is that these infinities never appear. You never need to renormalize string theory in this way, which is what lets it work as a theory of quantum gravity. N=8 supergravity, the gravity cousin of N=4 super Yang-Mills, might also have this handy property, which is why many people are so eager to study it.

Why we Physics

There are a lot of good reasons to study theories in theoretical physics, even the ones that aren’t true. They teach us how to do calculations in other theories, including those that do describe reality, which lets us find out fundamental facts about nature. They let us hone our techniques, developing novel methods that often find use later, in some cases even spinoff technology. (Mathematica came out of the theoretical physics community, while experimental high energy physics gave birth to the World Wide Web.)

Of course, none of this is why physicists actually do physics. Sure, Nima Arkani-Hamed might need to tell himself that space-time is doomed in order to get up in the morning, but for a lot of us, it isn’t about proving any wide-ranging point about the universe. It’s not even all about the awesome, as some would have it: most of what we do on a day-to-day basis isn’t especially awesome. It goes a bit deeper than that.

Science, in the end, is about solving puzzles. And solving puzzles is immensely satisfying, on a deep, fundamental level.

There’s a unique feeling that you get when all the pieces come together, when you’re calculating something and everything cancels and you’re left with a simple answer, and for some people that’s the best thing in existence.

It’s especially true when you’re working with an ansatz or using some other method where you fix parameters and fill in uncertainties, one by one. You can see how close you are to the answer, which means each step gives you that little thrill of getting just that much closer. One of my colleagues describes the calculations he does in supergravity as not tedious but “delightful” for precisely this reason: a calculation where every step puts another piece in the right place just feels good.

Theoretical physicists are the kind of people who would get a Lego set for their birthday, build it up to completion, and then never play with it again (unless it was to take it apart and make something else). We do it for the pure joy of seeing something come together and become complete. Save what it’s “for” for the grant committees; we’ve got a different rush in mind.

The Royal We of Theoretical Physics

I’m about to show you an abstract from a theoretical physics paper. Don’t worry about what it says, just observe the grammar.

[Image: the abstract of a theoretical physics paper]

Notice anything? Here, I’ll zoom in:

[Image: the same abstract, zoomed in on the word “we”]

This paper has one author, Edward Witten. So who’s “we”?

As it turns out, it is actually quite common in theoretical physics for a paper to use the word “we”, even when it is written by a single author. While this tradition has been called stilted, pompous, and just plain bad writing, there is a legitimate reason behind it. “We” is convenient, because it represents several different important things.

While the paper I quoted was written by only one author, many papers are collaborative efforts, and depending on the collaboration’s style it is often hard to distinguish who did what. Using “we” smooths over these differences in a consistent way.

What about single-authored papers, though? For a single author, and often even for multiple authors, “we” means the author plus the reader.

In principle, anyone reading a paper in theoretical physics should be able to follow along, doing the calculations on their own, and replicate the paper’s results. In practice this can often be difficult to impossible, but it’s still true that if you want to really retain what you read in theoretical physics, you need to follow along and do some of the calculation yourself. As a nod to this, it is conventional to write theoretical physics papers as if the reader was directly participating, leading them through the results point by point like exercises in a textbook. “We” do one calculation, then “we” use the result to derive the next point, and so on.

There are other meanings that “we” can occasionally serve, such as referring to everyone in a particular field, or a group in a hypothetical example.

While each of these meanings of “we” could potentially use a different word, that tends to make a paper feel cluttered, with jarring transitions between different subjects. Using “we” for everything gives the paper a consistent voice and feel, though it does come at the cost of obscuring some of the specific details of who did what. Especially for collaborations, the “we the collaborators” and “we the author plus reader” meanings can overlap and blur together. This usually isn’t a problem, but as I’ve been finding out recently it does make things tricky when writing for audiences outside theoretical physics, such as university committees whose guidelines require a thesis to clearly specify who in a collaboration did what.

On an unrelated note, two papers went up this week pushing the hexagon function story to new and impressive heights. I wasn’t directly involved in either; I’ve been attacking a somewhat different part of the problem, and you can look forward to something on that in a few months.

Caltech Amplitudes Workshop, and Valentine’s Poem 2014

This week’s post will be a short one. I’m at a small workshop for young amplitudes-folks at Caltech, so I’m somewhat busy.

(What we call a workshop is a small conference focused on fostering discussion and collaboration. While there are a few talks to give the workshop structure, most of the time is spent in more informal discussions between the participants.)

There have been a lot of great talks, and a lot of great opportunities to bond with fellow young amplitudeologists. Also, great workshop swag!

Yes, that is a Hot Wheels Mars Rover

Unrelatedly, to continue a tradition from last year, and since it’s Valentine’s Day, allow me to present a short physics-themed poem I wrote a long time ago, this one about the sometimes counter-intuitive laws of thermodynamics:

Thermodynamic Hypothesis

A cold object, like a hot one, must be insulated

Cut off from interaction

Immerse the subject in a bath of warmth

And I reach equilibrium

What’s in a Thesis?

As I’ve mentioned before, I’m graduating this spring, which means I need to write that most foreboding of documents, the thesis. As I work on it, I’ve been thinking about how the nature of the thesis varies from field to field.

If you don’t have much experience with academics, you probably think of a thesis as a single, overarching achievement that structures a grad student’s career. A student enters grad school, designs an experiment, performs it, collects data, analyzes the data, draws some conclusion, then writes a thesis about it and graduates.

In some fields, the thesis really does work that way. In biology for example, the process of planning an experiment, setting it up, and analyzing and writing up the data can be just the right size so that, a reasonable percentage of the time, it really can all be done over the course of a PhD.

Other fields tend more towards smaller, faster-paced projects. In theoretical physics, mathematics, and computer science, most projects don’t have the same sort of large experimental overhead that psychologists or biologists have to deal with. The projects I’ve worked on are large-scale for theoretical physics, and I’ll still likely have worked on three distinct things before I graduate. Others, with smaller projects, will often have covered more.

In this situation, a thesis isn’t one overarching idea. Rather, it’s a compilation of work from past projects, sewn together with a pretense of an overall theme. It’s a bit messy, but because it’s the way things are expected to be done in these fields, no one minds particularly much.

The other end of the spectrum is potentially much harder to deal with. For those who work on especially big experiments, the payoff might take longer to arrive than any reasonable degree. Big machines like colliders and particle detectors can take well over a decade before they start producing data, while longitudinal studies that follow a population as they grow and age take a long time no matter how fast you work.

In cases like this, the challenge is to chop off a small enough part of the project to make it feel like a thesis. A thesis could be written about designing one component for the eventual machine, or analyzing one part of the vast sea of data it produces. Preliminary data from a longitudinal study could be analyzed, even when the final results are many years down the line.

People in these fields have to be flexible and creative when it comes to creating a thesis, but usually the thesis committee is reasonable. In the end, a thesis is what you need to graduate, whatever that actually is for you.

Editors, Please Stop Misquoting Hawking

If you’ve been following science news recently, you’ve probably heard the apparently alarming news that Stephen Hawking has turned his back on black holes, or that black holes can actually be escaped, or…how about I just show you some headlines:

[Screenshots: headlines from Fox News, Nature, and Yahoo proclaiming that Hawking says black holes don’t exist]

Now, Hawking didn’t actually say that black holes don’t exist, but while there are a few good pieces on the topic, in many cases the real message has gotten lost in the noise.

From Hawking’s paper:

[Image: excerpt from Hawking’s paper]

What Hawking is proposing is that the “event horizon” around a black hole, rather than being an absolute permanent boundary from which nothing can escape, is a more temporary “apparent” horizon, the properties of which he goes on to describe in detail.

Why is he proposing this? It all has to do with the debate over black hole firewalls.

Starting with a paper by Polchinski and colleagues a year and a half ago, the black hole firewall paradox centers on contradictory predictions from general relativity and quantum mechanics. General relativity predicts that an astronaut falling past a black hole’s event horizon will notice nothing particularly odd about the surrounding space, but that once past the event horizon none of the “information” that specifies the astronaut’s properties can escape to the outside world. Quantum mechanics on the other hand predicts that information cannot be truly lost. The combination appears to suggest something radical, a “firewall” of high energy radiation around the event horizon carrying information from everything that fell into the black hole in the past, so powerful that it would burn our hypothetical astronaut to a crisp.

Since then, a wide variety of people have made one proposal or another, either attempting to avoid the seemingly preposterous firewall or to justify and further explain it. The reason the debate is so popular is because it touches on some of the fundamental principles of quantum mechanics.

Now, as I have pointed out before, I’m not a good person to ask about the fundamental principles of quantum mechanics. (Incidentally, I’d love it if some of the more quantum information or general relativity-focused bloggers would take a more substantial crack at this! Carroll, Preskill, anyone?) What I can talk about, though, is hype.

All of the headlines I listed take Hawking’s quote out of context, but not all of the articles do. The problem isn’t so much the journalists, as the editors.

One of an editor’s responsibilities is to take articles and give them titles that draw in readers. The editor wants a title that will get people excited, make them curious, and most importantly, get them to click. While a journalist won’t have any particular incentive to improve ad revenue, the same cannot be said for an editor. Thus, editors will often rephrase the title of an article in a way that makes the whole story seem more shocking.

Now that, in itself, isn’t a problem. I’ve used titles like that myself. The problem comes when the title isn’t just shocking, but misleading.

When I call astrophysics “impossible”, nobody is going to think I mean it literally. The title is petulant and ridiculous enough that no-one would take it at face value, but still odd enough to make people curious. By contrast, when you say that Hawking has “changed his mind” about black holes or said that “black holes do not exist”, there are people who will take that at face value as supporting their existing beliefs, as the Borowitz Report humorously points out. These people will go off thinking that Hawking really has given up on black holes. If the title confirms their beliefs enough, people might not even bother to read the article. Thus, by using an actively misleading title, you may actually be decreasing clicks!

It’s not that hard to write a title that’s both enough of a hook to draw people in and won’t mislead. Editors of the world, you’re well-trained writers, certainly much better than me. I’m sure you can manage it.

There really is some interesting news here, if people had bothered to look into it. The firewall debate has been going on for a year and a half, and while Hawking isn’t the universal genius the media occasionally depicts he’s still the world’s foremost expert on the quantum properties of black holes. Why did he take so long to weigh in? Is what he’s proposing even particularly new? I seem to remember people discussing eliminating the horizon in one way or another (even “naked” singularities) much earlier in the firewall debate…what makes Hawking’s proposal novel and different?

This is the sort of thing you can use to draw in interest, editors of the world. Don’t just write titles that cause ignorant people to roll their eyes and move on, instead, get people curious about what’s really going on in science! More ad revenue for you, more science awareness for us, sounds like a win-win!

How (Not) to Sum the Natural Numbers: Zeta Function Regularization

1+2+3+4+5+6+\ldots=-\frac{1}{12}

If you follow Numberphile on YouTube or Bad Astronomy on Slate you’ve already seen this counter-intuitive sum written out. Similarly, if you follow those people or Scientopia’s Good Math, Bad Math, you’re aware that the way that sum was presented by Numberphile in that video was seriously flawed.

There is a real sense in which adding up all of the natural numbers (numbers 1, 2, 3…) really does give you minus one-twelfth, despite all the reasons this should be impossible. However, there is also a real sense in which it does not, and cannot, do any such thing. To explain this, I’m going to introduce two concepts: complex analysis and regularization.

This discussion is not going to be mathematically rigorous, but it should give an authentic and accurate view of where these results come from. If you’re interested in the full mathematical details, a later discussion by Numberphile should help, and the mathematically confident should read Terence Tao’s treatment from back in 2010.

With that said, let’s talk about sums! Well, one sum in particular:

\frac{1}{1^s}+\frac{1}{2^s}+\frac{1}{3^s}+\frac{1}{4^s}+\frac{1}{5^s}+\frac{1}{6^s}+\ldots = \zeta(s)

If s is greater than one, then each term in this infinite sum gets smaller and smaller fast enough that you can add them all up and get a number. That number is referred to as \zeta(s), the Riemann Zeta Function.
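
For example, at s = 2 the sum converges to a famous finite value (the Basel problem):

\frac{1}{1^2}+\frac{1}{2^2}+\frac{1}{3^2}+\frac{1}{4^2}+\ldots = \zeta(2) = \frac{\pi^2}{6}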

So what if s is smaller than one?

The infinite sum that I described doesn’t converge for s less than one. Add it up in any reasonable way, and it just approaches infinity. Put another way, the sum is not properly defined. But despite this, \zeta(s) is not infinite for s less than one!

Now as you might object, we only defined the Riemann Zeta Function for s greater than one. How do we know anything at all about it for s less than one?

That is where complex analysis comes in. Complex analysis sounds like a made-up term for something unreasonably complicated, but it’s quite a bit more approachable when you know what it means. Analysis is the type of mathematics that deals with functions, infinite series, and the basis of calculus. It’s often contrasted with Algebra, which usually considers mathematical concepts that are discrete rather than smooth (this definition is a huge simplification, but it’s not very relevant to this post). Complex means that complex analysis deals with functions, not of everyday real numbers, but of complex numbers, or numbers with an imaginary part.

So what does complex analysis say about the Riemann Zeta Function?

One of the most impressive results of complex analysis is the discovery that if a function of a complex number is sufficiently smooth (the technical term is analytic) then it is very highly constrained. In particular, if you know how the function behaves over an area (technical term: open set), then you know how it behaves everywhere else!

If you’re expecting me to explain why this is true, you’ll be disappointed. This is serious mathematics, and serious mathematics isn’t the sort of thing you can give the derivation for in a few lines. It takes as much effort and knowledge to replicate a mathematical result as it does to replicate many lab results in science.

What I can tell you is that this sort of approach crops up in many places, and is part of a general theme. There is a lot you can tell about a mathematical function just by looking at its behavior in some limited area, because mathematics is often much more constrained than it appears. It’s the same sort of principle behind the work I’ve been doing recently.
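
A toy example of the same phenomenon, simpler than the zeta function: the geometric series only converges for |x| < 1, but the function it defines there makes sense almost everywhere else:

1 + x + x^2 + x^3 + \ldots = \frac{1}{1-x} \quad \text{for } |x| < 1

The right-hand side is perfectly finite at, say, x = 2, even though the series itself diverges there. The zeta function’s continuation below s = 1 works in the same spirit.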

In the case of the Riemann Zeta Function, we have a definition for s greater than one. As it turns out, this definition still works if s is a complex number, as long as the real part of s is greater than one. Given the function’s value over that large area (half of the complex plane), complex analysis tells us its value for every other number. In particular, it tells us this:

\zeta(-1)= -\frac{1}{12}

If the Riemann Zeta Function is consistently defined for every complex number, then it must have this value when s is minus one.
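
If you’d like to check this yourself, here is a minimal Python sketch (assuming the mpmath library): it compares the sum definition against ζ in the convergent region, then evaluates the continued function at s = -1.

from mpmath import inf, nsum, zeta

print(nsum(lambda n: 1 / n**2, [1, inf]))  # 1.64493..., the sum definition at s = 2
print(zeta(2))                             # the same value, pi^2/6
print(zeta(-1))                            # -0.0833... = -1/12, from the continuation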

If we still trusted the sum definition for this value of s, we could plug in -1 and get

 1+2+3+4+5+6+\ldots=-\frac{1}{12}

Does that make this statement true? Sort of. It all boils down to a concept from physics called regularization.

In physics, we know that in general there is no such thing as infinity. With a few exceptions, nothing in nature should be infinite, and finite evidence (without mathematical trickery) should never lead us to an infinite conclusion.

Despite this, occasionally calculations in physics will give infinite results. Almost always, this is evidence that we are doing something wrong: we are not thinking hard enough about what’s really going on, or there is something we don’t know or aren’t taking into account.

Doing physics research isn’t like taking a physics class: sometimes, nobody knows how to do the problem correctly! In many cases where we find infinities, we don’t know enough about “what’s really going on” to correct them. That’s where regularization comes in handy.

Regularization is the process by which an infinite result is replaced with a finite result (made “regular”), in a way so that it keeps the same properties. These finite results can then be used to do calculations and make predictions, and so long as the final predictions are regularization independent (that is, the same if you had done a different regularization trick instead) then they are legitimate.

In string theory, one way to compute the required dimensions of space and time ends up giving you an infinite sum, a sum that goes 1+2+3+4+5+…. In context, this result is obviously wrong, so we regularize it. In particular, we say that what we’re really calculating is the Riemann Zeta Function, which we happen to be evaluating at -1. Then we replace 1+2+3+4+5+… with -1/12.

Now remember when I said that getting infinities is a sign that you’re doing something wrong? These days, we have a more rigorous way to do this same calculation in string theory, one that never forces us to take an infinite sum. As expected, it gives the same result as the old method, showing that the old calculation was indeed regularization independent.

Sometimes we don’t have a better way of doing the calculation, and that’s when regularization techniques come in most handy. A particular family of tricks called renormalization is quite important, and I’ll almost certainly discuss it in a future post.

So can you really add up all the natural numbers and get -1/12? No. But if a calculation tells you to add up all the natural numbers, and it’s obvious that the result can’t be infinite, then it may secretly be asking you to calculate the Riemann Zeta Function at -1. And that, as we know from complex analysis, is indeed -1/12.