Tag Archives: DoingScience

Numerics, or, Why can’t you just tell the computer to do it?

When most people think of math, they think of the math they did in school: repeated arithmetic until your brain goes numb, followed by basic algebra and trig. You weren’t allowed to use calculators on most tests for the simple reason that almost everything you did could be done by a calculator in a fraction of the time.

Real math isn’t like that. Mathematicians handle proofs and abstract concepts, definitions and constructions and functions and generally not a single actual number in sight. That much, at least, shouldn’t be surprising.

What might be surprising is that even tasks that seem tailor-made for a computer can take a fair bit of human ingenuity.

In physics, I do a lot of integrals. For those of you unfamiliar with calculus, integrals can be thought of as the area between a curve and the x-axis.

Areas seem like the sort of thing it would be easy for a computer to find. Chop the space into little rectangles, add up all the rectangles under the curve, and if your rectangles are small enough you should get the right answer. Broadly, this is the method of numerical integration. Since computers can do billions of calculations per second, you can chop things up into billions of rectangles and get as close as you’d like, right?

[Image: “Heck, ten is a lot. Can we just do ten?”]
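In code, the rectangle recipe is only a few lines. Here is a minimal sketch in Python (the example curve and the rectangle counts are arbitrary choices of mine, just for illustration):

    # Approximate the area under f on [a, b] with n equal-width rectangles,
    # each evaluated at its midpoint.
    def midpoint_sum(f, a, b, n):
        width = (b - a) / n
        return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

    # For a tame curve like f(x) = x**2 on [0, 1] (true area: 1/3),
    # even ten rectangles do respectably, and a million is overkill:
    print(midpoint_sum(lambda x: x**2, 0.0, 1.0, 10))         # ~0.3325
    print(midpoint_sum(lambda x: x**2, 0.0, 1.0, 1_000_000))  # ~0.3333333333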

For some curves, this works fine. For others, though…

[Image: “Ten might not be enough for this one.”]

See how the left side of that plot goes off the chart? That curve goes to infinity. No matter how many rectangles you put on that side, you still won’t have any that are infinitely tall, so you’ll still miss that part of the curve.

Surprisingly enough, the area under this curve isn’t infinite. Do the integral correctly, and you get a result of 2. Set a computer to calculate this integral via the sort of naïve numerical integration discussed above though, and you’ll never find that answer. You need smarter methods: smart humans doing the math, or smart humans programming the computer.
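To see the problem concretely: a curve with exactly that behavior is f(x) = 1/sqrt(x) on (0, 1], whose area is exactly 2. Here is a minimal sketch of the failure (the code and the substitution at the end are my illustration, not any particular program’s method). Put a rectangle’s corner at x = 0 and you divide by zero; dodge the endpoint with midpoints and the error shrinks only like 1/sqrt(n), far too slowly to be practical:

    import math

    # A curve that blows up at x = 0 but still has finite area:
    # the integral of 1/sqrt(x) from 0 to 1 is exactly 2.
    f = lambda x: 1.0 / math.sqrt(x)

    def midpoint_sum(f, a, b, n):
        width = (b - a) / n
        return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

    # Naive rectangles creep toward 2; six digits of accuracy would
    # take hundreds of billions of rectangles.
    for n in (10, 1000, 1_000_000):
        print(n, midpoint_sum(f, 0.0, 1.0, n))   # 1.80..., 1.98..., 1.999...

    # The smart-human route: substitute x = u**2, so dx = 2*u*du and the
    # integrand 1/sqrt(x) becomes the constant 2, with no spike at all.
    print(midpoint_sum(lambda u: 2.0, 0.0, 1.0, 10))   # exactly 2.0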

Another way this can come up is if you’re adding up two parts of something that go to infinity in opposite directions. Try to integrate each part by itself and you’ll be stuck.

[Images: plots of the two individual pieces]

But add them together, and you get something quite a bit more tractable.

[Image: “Yeah, definitely a ten-rectangle job.”]
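Here is the same phenomenon with a concrete integrand (my own hypothetical example, chosen for simplicity): e^x/x and -1/x each diverge at x = 0, but their sum (e^x - 1)/x tends quietly to 1 there.

    import math

    # Each piece alone blows up at x = 0, in opposite directions:
    g_plus  = lambda x: math.exp(x) / x    # -> +infinity as x -> 0
    g_minus = lambda x: -1.0 / x           # -> -infinity as x -> 0

    # Their sum is perfectly tame: (e**x - 1)/x -> 1 as x -> 0.
    g_sum = lambda x: (math.exp(x) - 1.0) / x

    def midpoint_sum(f, a, b, n):
        width = (b - a) / n
        return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

    # Integrating the combined function on [0, 1] really is a ten-rectangle job:
    print(midpoint_sum(g_sum, 0.0, 1.0, 10))   # ~1.3178 (true value ~1.3179)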

Numerical integration, and computers in general, are very important tools in a scientist’s arsenal. But in order to use them, we have to be smart and know what we’re doing. Using our tools correctly can take almost as much expertise and care as working without them.

So no, I can’t just tell the computer to do it.

“Super” Computers: Using a Cluster

When I join a new department or institute, the first thing I ask is “do we have a cluster?”

Most of what I do, I do on a computer. Gone are the days when theorists would always do all their work on notepads and chalkboards (though many still do!). Instead, we use specialized computer programs like Mathematica and Maple. Using a program helps keep us from forgetting pesky minus signs, and it allows working with equations far too long to fit on a sheet of paper.
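For a taste of what these programs do, here is a toy example in the open-source sympy package (a stand-in for illustration; I actually use Mathematica):

    import sympy

    x, y = sympy.symbols("x y")

    # The computer tracks every term and every pesky minus sign:
    expr = sympy.expand((x - y)**6)
    print(expr)   # x**6 - 6*x**5*y + 15*x**4*y**2 - ... + y**6

    # And it can confirm that nothing was dropped along the way:
    print(sympy.simplify(expr - (x - y)**6))   # 0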

Now, if computers help, more computers should help more. Since physicists like to add “super” to things, what about a supercomputer?

[Image: “The Jaguars of the computing world.”]

Supercomputers are great, but they’re also expensive. The people who use supercomputers are the ones who model large, complicated systems, like the weather, or supernovae. For most theorists, you still want power, but you don’t need quite that much. That’s where computer clusters come in.

A computer cluster is pretty much what it sounds like: several computers wired together. Different clusters contain different numbers of computers. For example, my department has a ten-node cluster. Sure, that doesn’t stack up to a supercomputer, but it’s still ten times as fast as an ordinary computer, right?

[Image: “The power of ten computers!”]

Well, not exactly. As several of my friends have been surprised to learn, the computers on our cluster are actually slower than most of our laptops.

[Image: “The power of ten old computers!”]

Still, ten older computers are faster than one new one, yes?

Even then, it depends how you use it.

Run a normal task on a cluster, and it’s just going to run on one of the computers, which, as I’ve said, are slower than a modern laptop. You need to get smarter.

There are two big advantages of clusters: time, and parallelization.

Sometimes, you want to do a calculation that will take a long time. Your computer is going to be busy for a day or two, and that’s inconvenient when you want to do…well, pretty much anything else. A cluster is a space to run those long calculations. You put the calculation on one of the nodes, you go back to doing your work, and you check back in a day or two to see if it’s finished.

Clusters are at their most powerful when you can parallelize. If you need to do ten versions of the same calculation, each slightly different, then rather than doing them one at a time a cluster lets you do them all at once. At that point, it really is making you ten times faster.
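Here is a minimal sketch of that kind of parallelization in Python (the “calculation” and its ten parameter values are hypothetical stand-ins):

    from multiprocessing import Pool

    # Hypothetical stand-in for a slow calculation, one "version" per parameter.
    def long_calculation(coupling):
        total = 0.0
        for k in range(1, 10_000_000):
            total += coupling / k**2
        return total

    if __name__ == "__main__":
        couplings = [0.1 * i for i in range(1, 11)]   # ten slightly different inputs
        # Farm all ten versions out at once, one per node (or core):
        with Pool(processes=10) as pool:
            results = pool.map(long_calculation, couplings)
        print(results)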

If you ever program, I’d encourage you to look into the resources you have available. A cluster is a very handy thing to have access to, no matter what you’re doing!

Why we Physics

There are a lot of good reasons to study theories in theoretical physics, even the ones that aren’t true. They teach us how to do calculations in other theories, including those that do describe reality, which lets us find out fundamental facts about nature. They let us hone our techniques, developing novel methods that often find use later, sometimes even as spinoff technology. (Mathematica came out of the theoretical physics community, while experimental high energy physics led to the birth of the modern internet.)

Of course, none of this is why physicists actually do physics. Sure, Nima Arkani-Hamed might need to tell himself that space-time is doomed in order to get up in the morning, but for a lot of us, it isn’t about proving any wide-ranging point about the universe. It’s not even all about the awesome, as some would have it: most of what we do on a day-to-day basis isn’t especially awesome. It goes a bit deeper than that.

Science, in the end, is about solving puzzles. And solving puzzles is immensely satisfying, on a deep, fundamental level.

There’s a unique feeling that you get when all the pieces come together, when you’re calculating something and everything cancels and you’re left with a simple answer, and for some people that’s the best thing in existence.

It’s especially true when you’re working with an ansatz or using some other method where you fix parameters and fill in uncertainties, one by one. You can see how close you are to the answer, which means each step gives you that little thrill of getting just that much closer. One of my colleagues describes the calculations he does in supergravity as not tedious but “delightful” for precisely this reason: a calculation where every step puts another piece in the right place just feels good.

Theoretical physicists are the kind of people who would get a Lego set for their birthday, build it up to completion, and then never play with it again (unless it was to take it apart and make something else). We do it for the pure joy of seeing something come together and become complete. Save what it’s “for” for the grant committees, we’ve got a different rush in mind.

The Royal We of Theoretical Physics

I’m about to show you an abstract from a theoretical physics paper. Don’t worry about what it says, just observe the grammar.

[Image: the paper’s abstract]

Notice anything? Here, I’ll zoom in:

[Image: close-up of the abstract’s “we”]

This paper has one author, Edward Witten. So who’s “we”?

As it turns out, it is actually quite common in theoretical physics for a paper to use the word “we”, even when it is written by a single author. While this tradition has been called stilted, pompous, and just plain bad writing, there is a legitimate reason behind it. “We” is convenient, because it represents several different important things.

While the paper I quoted was written by only one author, many papers are collaborative efforts. In a collaboration, it is often hard to pin down who did what, and different collaborations divide their work in different ways. Using “we” smooths over those differences.

What about single-authored papers, though? For a single author, and often even for multiple authors, “we” means the author plus the reader.

In principle, anyone reading a paper in theoretical physics should be able to follow along, doing the calculations on their own, and replicate the paper’s results. In practice this can often be difficult to impossible, but it’s still true that if you want to really retain what you read in theoretical physics, you need to follow along and do some of the calculation yourself. As a nod to this, it is conventional to write theoretical physics papers as if the reader were directly participating, leading them through the results point by point like exercises in a textbook. “We” do one calculation, then “we” use the result to derive the next point, and so on.

There are other meanings that “we” can occasionally serve, such as referring to everyone in a particular field, or a group in a hypothetical example.

While each of these meanings of “we” could potentially use a different word, that tends to make a paper feel cluttered, with jarring transitions between different subjects. Using “we” for everything gives the paper a consistent voice and feel, though it does come at the cost of obscuring some of the specific details of who did what. Especially for collaborations, the “we the collaborators” and “we the author plus reader” meanings can overlap and blur together. This usually isn’t a problem, but as I’ve been finding out recently, it does make things tricky when writing for people who aren’t theoretical physicists, such as universities with guidelines that require a thesis to clearly specify who in a collaboration did what.

On an unrelated note, two papers went up this week pushing the hexagon function story to new and impressive heights. I wasn’t directly involved in either; I’ve been attacking a somewhat different part of the problem, and you can look forward to something on that in a few months.

What’s in a Thesis?

As I’ve mentioned before, I’m graduating this spring, which means I need to write that most foreboding of documents, the thesis. As I work on it, I’ve been thinking about how the nature of the thesis varies from field to field.

If you don’t have much experience with academics, you probably think of a thesis as a single, overarching achievement that structures a grad student’s career. A student enters grad school, designs an experiment, performs it, collects data, analyzes the data, draws some conclusion, then writes a thesis about it and graduates.

In some fields, the thesis really does work that way. In biology for example, the process of planning an experiment, setting it up, and analyzing and writing up the data can be just the right size so that, a reasonable percentage of the time, it really can all be done over the course of a PhD.

Other fields tend more towards smaller, faster-paced projects. In theoretical physics, mathematics, and computer science, most projects don’t have the same sort of large experimental overhead that psychologists or biologists have to deal with. The projects I’ve worked on are large-scale for theoretical physics, and I’ll still likely have worked on three distinct things before I graduate. Others, with smaller projects, will often have covered more.

In this situation, a thesis isn’t one overarching idea. Rather, it’s a compilation of work from past projects, sewn together with a pretense of an overall theme. It’s a bit messy, but because it’s the way things are expected to be done in these fields, no one particularly minds.

The other end of the spectrum is potentially much harder to deal with. For those who work on especially big experiments, the payoff might take longer to arrive than any reasonable degree. Big machines like colliders and particle detectors can take well over a decade before they start producing data, while longitudinal studies that follow a population as they grow and age take a long time no matter how fast you work.

In cases like this, the challenge is to chop off a small enough part of the project to make it feel like a thesis. A thesis could be written about designing one component for the eventual machine, or analyzing one part of the vast sea of data it produces. Preliminary data from a longitudinal study could be analyzed, even when the final results are many years down the line.

People in these fields have to be flexible and creative when it comes to creating a thesis, but usually the thesis committee is reasonable. In the end, a thesis is what you need to graduate, whatever that actually is for you.

Four Gravitons and a…Postdoc?

As a few of you already know, it’s looking increasingly certain that I will be receiving my Ph.D. in the spring. I’ll graduate, ceasing to be a grad student and becoming that most mysterious of academic entities, a postdoc.

When describing graduate school before, I compared it to an apprenticeship. (I expanded on that analogy more here.) Let’s keep pursuing that analogy. If a graduate student is like an apprentice, then a Postdoctoral Scholar, or Postdoc, is like a journeyman.

In Medieval Europe, once an apprenticeship was completed, the apprentice was permitted to work independently, earning a wage for their own labors. However, they still would not have their own shop. Instead, they would work for a master craftsman. Such a person was called a journeyman, after the French word journée, meaning a day’s work.

Similarly, once a graduate student gets their Ph.D., they are able to do scientific research independently. However, most graduate students are not ready to be professors when fresh out of their Ph.D. Instead, they become postdocs, working in an established professor’s group. Like a journeyman, a postdoc is nominally independent, but in practice works under loose supervision from the more mature members of their field.

Another similarity between postdocs and journeymen is their tendency to travel. Historically, a journeyman would spend several years traveling, studying in the workshops of several masters. Similarly, a postdoc will often (especially in today’s interconnected world) travel far from where they began in order to broaden their capabilities.

A postdoctoral job generally lasts two or three years (just one, for particularly short positions). Most scientists will go through at least one postdoctoral position after achieving their Ph.D. In some fields (theoretical physics in particular), a scientist will have two or three such positions in different places before finding a job as a professor. Postdocs are paid significantly better than grad students, but generally significantly worse than professors. They don’t (typically) teach, but depending on the institution and field they may do some TA work.

Since I’m still a grad student, my blog is titled “4 gravitons and a grad student”. That could change, though. Once I become a postdoc, I have three options:

  1. Keep the old title. Keeping the same title and domain name makes it easier for people to find the blog. It also maintains the alliteration, which I think is fun. On the other hand, it would be hard to justify, and I’d likely have to write something silly about taking a grad student perspective or the like.
  2. Change to “4 gravitons and a postdoc”. I’d lose the fun alliteration, but the title would accurately represent my current state. However, I might lose a few readers who don’t expect the change.
  3. Cut it down to “4 gravitons”. This matches the blog’s Twitter handle (@4gravitons). It’s quick, it’s recognizable, and it keeps the memorable part of the old title without adding anything new to remember. However, it would be less unique in Google searches.

What do you folks think? I’ve still got a while to decide, and I’d love to hear your opinions!

The Amplitudes Revolution Will Not Be Televised (But It Will Be Streamed)

I’ve been at the Simons Center’s workshop on the Geometry and Physics of Scattering Amplitudes all week, so I don’t have time for a long post. There have been a lot of great talks from a lot of great amplitudes-folks (including one on Tuesday by Lance Dixon discussing this work, and one on the same day explaining the much-hyped amplituhedron). Curious folks can follow the conference link above to find videos and slides for each of the talks, arranged by the talk schedule.

I’ve made some great contacts, picked up a couple running jokes (check out Rutger Boels’s talk on Monday and Lance’s talk on Tuesday), heard the phrase “only seven loops” stated in relative seriousness, and heard the story of why the conference ended up choosing an artist’s conception of the amplituhedron for the workshop poster, which I can relate if folks are especially curious.

Elegance, Not So Mysterious

You’ll often hear theoretical physicists in the media referring to one theory or another as “elegant”. String theory in particular seems to get this moniker fairly frequently.

It may often seem like mathematical elegance is some sort of mysterious sixth sense theorists possess, as inexplicable to the average person as color to a blind person. What’s “elegant” about string theory, after all?

Before explaining elegance, I should take a bit of time to say what it’s not. Elegance isn’t Occam’s razor. It isn’t naturalness, either. Both of those concepts have their own technical definitions.

Elegance, by contrast, is a much hazier, and yet much simpler, notion. It’s hazy enough that any definition could provoke arguments, but I can at least give you an approximate idea by telling you that an elegant theory is simple to describe, if you know the right terms. Often, it is simpler than the phenomenon that it explains.

How does this apply to something like string theory? String theory seems to be incredibly complicated: ten dimensions, curled up in a truly vast number of different ways, giving rise to whole spectra of particles.

That said, the basic idea is quite simple. String theory asks the question: what if, in addition to fundamental point-particles (zero dimensional objects), there were fundamental objects of other dimensions? That idea leads to complicated consequences: if your theory is going to produce all the particles of the real world then you need the ten dimensions and the supersymmetry and yadda yadda. But the basic idea is simple to describe. An elegant theory can have very complicated consequences, but still be simple to describe.

This, broadly, is the sort of explanation theoretical physicists look for. Math is the kind of field where simple underlying systems can describe very complex phenomena. Since theoretical physics is about describing the world in terms of math, the right explanation is usually the most elegant.

This can occasionally trip physicists up when they migrate to other careers. In biology, for example, the elegant solution is often not the right one, because evolution doesn’t care about elegance: evolution just grabs whatever is within reach. Financial systems and economics occasionally have similar problems. All this is to say that while elegance is an important thing for a physicist to strive for, sometimes we have to be careful about it.

Where are the Amplitudeologists?

As I’ve mentioned a couple of times before, I’m part of a sub-field of theoretical physics called Amplitudeology.

Amplitudeology in its modern incarnation is relatively new, and concentrated in a few specific centers. I thought it might be interesting to visualize which universities have amplitudeologists, so I took a look at the attendee lists of two recent conferences and put their affiliations into Google Maps. In an attempt to balance things, one of the conferences was in North America and the other in Europe. (For the curious, a sketch of how one might automate this kind of map appears at the end of this post.) Here’s what the map showed:

The West Coast of the US has two major centers, Stanford/SLAC and UCLA, focused around Lance Dixon and Zvi Bern respectively. The Northeast has a fair assortment, including places that cover essentially everything, like the Perimeter Institute and the Institute for Advanced Study, and places known especially for their amplitudes work, like Brown.

Europe has quite a large number of places. There are many universities in Europe with a long history of technical research into quantum field theory. When amplitudes began to become more prominent as its own sub-field, many of these places slotted right in. In particular, there are many locations in Germany, a decent number in the UK, a few in the vicinity of CERN, and a variety of places of some importance elsewhere.

Outside of Europe and North America, there’s much less amplitudes research going on. Physics in general is a very international enterprise, and many sub-fields have a lot of participation from researchers in China, India, Japan, and Korea. Amplitudes, for the most part, hasn’t caught on in those places yet.

This map is just a result of looking at two conferences. More data would yield many places that were left out of this setup, including a longstanding community in Russia. Still, it gives you a rough idea of where to find amplitudeologists, should you have need of one.
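As promised above, here is a minimal sketch of how one might automate this kind of map. I assembled mine by hand in Google Maps; the geopy and folium packages, the short affiliation list, and the output file name below are illustrative assumptions, not what I actually used.

    import folium
    from geopy.geocoders import Nominatim

    # A few example affiliations; a real list would come from the
    # conferences' attendee pages.
    affiliations = ["SLAC National Accelerator Laboratory", "UCLA",
                    "Perimeter Institute", "Brown University", "CERN"]

    geolocator = Nominatim(user_agent="amplitudes-map-sketch")
    world_map = folium.Map(location=[30, 0], zoom_start=2)

    for name in affiliations:
        place = geolocator.geocode(name)   # look up latitude and longitude
        if place is not None:
            folium.Marker([place.latitude, place.longitude],
                          popup=name).add_to(world_map)

    world_map.save("amplitudeologists.html")   # open in a browser to view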

High Energy? What does that mean?

I am a high energy physicist who uses the high energy and low energy limits of a theory that, while valid up to high energies, is also a low-energy description of what at high energies ends up being string theory (string theorists, of course, being high energy physicists as well).

If all of that makes no sense to you, congratulations, you’ve stumbled upon one of the worst-kept secrets of theoretical physics: we really could use a thesaurus.

“High energy” means different things in different parts of physics. In general, “high” versus “low” energy classifies what sort of physics you look at. “High” energy physics corresponds to the very small, while “low” energies encompass larger structures.

Many people explain this via quantum mechanics: the uncertainty principle says that the more certain you are of a particle’s position, the less certain you can be of how fast it is going, which would imply that a particle that is highly restricted in location might have very high energy. You can also understand it without quantum mechanics, though: if two things are held close together, it generally has to be by a powerful force, so the bond between them will contain more energy.

Another perspective is in terms of light. Physicists will occasionally use “IR”, or infrared, to mean “low energy” and “UV”, or ultraviolet, to mean “high energy”. Infrared light has long wavelengths and low energy photons, while ultraviolet light has short wavelengths and high energy photons, so the analogy is apt. However, the analogy only goes so far, since “UV physics” is often at energies much greater than those of UV light (and the same sort of situation applies for IR).
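To put rough numbers on the light analogy, a photon’s energy is inversely proportional to its wavelength: E = hc/λ, or about 1240 eV·nm divided by the wavelength in nanometers. A quick sketch (the wavelengths are just representative values I picked):

    # Photon energy from wavelength: E = h*c / wavelength.
    H_C_EV_NM = 1239.84   # Planck's constant times the speed of light, in eV*nm

    for name, wavelength_nm in [("infrared", 1000), ("visible", 500),
                                ("ultraviolet", 100)]:
        print(name, round(H_C_EV_NM / wavelength_nm, 2), "eV")
    # infrared ~1.24 eV, visible ~2.48 eV, ultraviolet ~12.4 eV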

So what does “low energy” or “high energy” mean? Well…

The IR limit: Lowest of the “low energy” points, this refers to the limit of infinitely low energy. While you might compare it to “absolute zero”, really it just refers to energy so low that, compared to the other energies in your calculation, it might as well be zero. This is the “low energy limit” I mentioned in the opening sentence.

Low energy physics: Not “high energy physics”. Low energy physics covers everything from absolute zero up to atoms. Once you get up to high enough energy to break up the nucleus of an atom, you enter…

High energy physics: Also known as “particle physics”, high energy physics refers to the study of the subatomic realm, which also includes objects that aren’t technically particles, like strings and “branes”. If you exclude nuclear physics itself, high energy physics generally refers to energies of a mega-electron-volt and up. For comparison, the electrons in atoms are bound by energies of around an electron-volt, the characteristic energy of chemistry, so high energy physics is at least a million times more energetic. That said, high energy physicists are often interested in the low energy consequences of their theories, all the way down to the IR limit. Interestingly, by this point we’ve already passed both infrared light (from a thousandth of an electron-volt to a single electron-volt) and ultraviolet light (several electron-volts to a hundred or so). Compared to UV light, mega-electron-volt-scale physics is quite high energy.

The TeV scale: If you’re operating a collider, though, mega-electron-volts (or MeV) are low energy physics. Often, calculations for colliders will assume that quarks, whose masses are around the MeV scale, actually have no mass at all! Instead, high energy for particle colliders means giga- (billion) or tera- (trillion) electron-volt processes. The LHC, for example, operates at around 7 TeV now, with 14 TeV planned. This is the range of scales where many had hoped to see supersymmetry, but as time has gone on, results have pushed speculation up to higher and higher energies. Of course, these are all still low energy from the perspective of…

The string scale: Strings are flexible, but under enormous tension that keeps them very, very short. Typically, strings are posited to be of length close to the Planck length, the characteristic length at which quantum effects become relevant for gravity. This enormously small length corresponds to the enormously large Planck energy, which is on the order of 10^28 electron-volts (a quick back-of-the-envelope check appears after these definitions). That’s about ten to the fifteen times the energies of the particles at the LHC, or ten to the twenty-two times the MeV scale that I called “high energy” earlier. For comparison, a milliliter of water contains about ten to the twenty-two molecules. When extra dimensions in string theory are curled up, they’re usually curled up at this scale. This means that from a string theory perspective, going to the TeV scale means ignoring the high energy physics and focusing on low energy consequences, which is why even the heaviest supersymmetric particles are thought of as low energy physics when approached from string theory.

The UV limit: Much as the IR limit is that of infinitely low energy, the UV limit is the formal limit of infinitely high energy. Again, it’s not so much an actual destination, as a comparative point where the energy you’re considering is much higher than the energy of anything else in your calculation.
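As promised, here is the back-of-the-envelope check on the Planck energy, a minimal sketch using standard values of the physical constants:

    import math

    # Planck energy: E_P = sqrt(hbar * c**5 / G), in SI units.
    hbar = 1.054571817e-34   # J*s
    c    = 2.99792458e8      # m/s
    G    = 6.67430e-11       # m**3 / (kg * s**2)
    eV   = 1.602176634e-19   # joules per electron-volt

    E_planck_eV = math.sqrt(hbar * c**5 / G) / eV
    print(f"Planck energy: {E_planck_eV:.2e} eV")   # ~1.22e+28 eV
    print(f"vs LHC: {E_planck_eV / 14e12:.1e}")     # ~8.7e+14, nearly 10**15
                                                    # times the LHC's 14 TeV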

These are the definitions of “high energy” and “low energy”, “UV” and “IR” that one encounters most often in theoretical particle physics and string theory. Other parts of physics have their own idea of what constitutes high or low energy, and I encourage you to ask people who study those parts of physics if you’re curious.